MariaDB & SkySQL DBaaS

This is a getting started guide for MariaDB SkySQL. Let’s start with two prerequisite definitions:

  1. MariaDB – MariaDB is an open-source relational database management system (RDBMS) that is a fork of MySQL, another popular open-source database system. It was created by the original developers of MySQL after concerns arose about the acquisition of MySQL by Oracle Corporation in 2010. MariaDB is designed to be a drop-in replacement for MySQL, which means that many applications and tools developed for MySQL can also work seamlessly with MariaDB without requiring significant changes. It retains much of the same syntax, APIs, and commands as MySQL, making the transition relatively straightforward for users.
  2. MariaDB SkySQL – MariaDB SkySQL is a cloud-native Database as a Service (DBaaS) offering provided by MariaDB Corporation, the company behind the development of the MariaDB open-source database system. SkySQL is designed to simplify database management, deployment, and scaling by providing a fully managed and highly available MariaDB database solution in the cloud.

Some key features of MariaDB include:

  1. High Performance: MariaDB incorporates optimizations and improvements to enhance query execution speed and overall performance.
  2. Storage Engines: MariaDB supports multiple storage engines, including the popular InnoDB and Aria engines. Each engine has its own characteristics and performance attributes, allowing users to choose the one that best fits their requirements.
  3. Security: MariaDB includes various security enhancements, such as data encryption at rest and in transit, improved authentication methods, and better access control mechanisms.
  4. Open Source: MariaDB is fully open source, which means its source code is available for anyone to inspect, modify, and contribute to.
  5. Community and Development: MariaDB has a vibrant and active community of developers and contributors who work on its continued development and improvement.
  6. Compatibility: As mentioned earlier, MariaDB aims for compatibility with MySQL, allowing applications developed for MySQL to work with minimal changes.
  7. Extensions: MariaDB introduces some features not present in MySQL, such as the Aria storage engine, thread pooling, and more advanced geographic information system (GIS) functionality.
  8. Replication and Clustering: Like MySQL, MariaDB supports various replication methods and clustering solutions for high availability and fault tolerance.
  9. Plugins: MariaDB offers a plugin architecture that allows users to add custom functionality and features to the database system.

To elaborate further on the specifics of MariaDB SkySQL, here are some of the features of the DBaaS (Database as a Service) offering:

  1. Managed Service: SkySQL takes care of database administration tasks such as provisioning, backup, monitoring, maintenance, and security updates. This allows users to focus more on their applications and less on managing the underlying database infrastructure.
  2. High Availability: SkySQL offers built-in high availability configurations that ensure database uptime and data durability. This includes automatic failover and replication setups.
  3. Scalability: SkySQL supports both vertical and horizontal scaling. Vertical scaling involves adjusting the resources of a single database instance, while horizontal scaling involves distributing data across multiple nodes for improved performance and capacity.
  4. Security: Security features such as encryption at rest and in transit, role-based access control, and network security protocols are integrated to help protect sensitive data.
  5. Multi-Cloud Support: SkySQL is designed to work across various cloud providers, enabling users to choose the cloud environment that best suits their needs. It supports popular cloud platforms like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).
  6. Compatibility: SkySQL maintains compatibility with the MariaDB database, which means applications developed for MariaDB can run seamlessly on SkySQL with minimal modifications.
  7. Global Distributed Architecture: SkySQL offers the capability to deploy databases across multiple geographic regions for improved performance and data availability across different parts of the world.
  8. Managed Upgrades: Regular updates and improvements to the MariaDB database engine are managed by the SkySQL service, ensuring that your databases remain up to date without manual intervention.
  9. Pay-as-You-Go: SkySQL’s pricing model is typically based on usage, allowing users to pay for the resources they consume. This can be cost-effective for businesses as it eliminates the need to invest in and maintain dedicated database infrastructure.

Prerequisites

Before going any further, let’s check off some prerequisites for getting started. For any database I work with, I like to pick up the native tool for that database. For some databases, like SQL Server, that might be a specific user interface application; for others it may be a CLI. In the case of MariaDB, the default way that many connect is via the mariadb CLI. As for installation, I’ll leave it to you, dear reader, to choose the method that fits your operating system. The installation resources out on the web for MariaDB, whether for Windows, Linux, or macOS, are pretty extensive. Here are the guides I’ve found useful in the past, along with some recent ones that are pretty nice:

  • Digital Ocean’s “How to Install MariaDB on Ubuntu 20.04” is a full installation guide and a great way to get a first install running, which in turn will ensure you have the mariadb CLI installed.
  • To use brew to install MariaDB on MacOS check out these docs here.
  • Digital Ocean has a page specific to MariaDB that lists out all of their installation guides here.
  • To get up and running on Windows, with a nice wizard installation, check out the docs page here on installing with an MSI package.

Final Prerequisite Checklist

  • Base OS load and IDEs, like this.
  • CLI mariadb installed.
  • JetBrains DataGrip installed, one of the default tools I always install.

MariaDB SkySQL Getting Started


At this point, sign up for an account and set up an instance via the “Launch a Cloud Database” page. Here’s a nice long screenshot, with my choices for the instance all selected.

I selected Enterprise Server since that would provide the basics of what I want to try out. Next up, a single node would serve well, in AWS’s Oregon, USA (us-west-2) region. I went with the default AMD64 architecture, a Sky-2×4 instance, and standard gp2 storage. I also left the transactional storage size at 100GB.

Next up I selected Enterprise Server 10.6.14-9, the recommended version. I gave it a most excellent service name and wrapped up with the IP allowlist option. I did learn that, for quick access, you want to select the option to add your current IP.


Finally, I needed to “Add payment method & launch”. Clicking that brought up the “Add a new credit card” dialog. I filled this out, but the “save” button wasn’t active; before it would activate, I had to click the validate phone number button. With that completed, the instance started launching, which can take a few minutes in my experience.

The instance now launched with a nice status dashboard.

Once the instance is up and running, navigate to the settings (left-hand side) and open up that screen to set up Secure Access.

In this dialog, since I didn’t check the “add IP address” previously, I’ll add it here.

I added the IP this way instead of selecting the check box previously for two reasons.

  1. I wanted to make sure I went and found out where I could add IPs to the allowlist. My IP will be changing, as they tend to do, and I want to know exactly where I’ll have to go to add additional IPs when it does. The same situation arises when I travel: I’d need to add a new IP wherever I happen to be.
  2. The second reason is even simpler: I wanted to figure out where this was in the interface and see how the interface was designed. It’s a pretty straightforward interface and even enables you to just click and it will give you your IP. At least, the IP that it can identify via the browser you’re using. If you’re using a VPN or are under some other networking setup this might not be accurate, but it’ll cover 99% of the cases.

Once I added the IP, I navigated back to this screen. Over to the far right is the “manage” button which provides many of the standard actions one would have to take to start, stop, or take other actions against the database instance.

Next to that is the “connect” button which is key to the next steps I want to take. On this modal dialog I have all the details, along with the *.pem to download, to make a connection with the various tools I want to connect with.

Since the CLI connection details are on the dialog, I’ll connect that way first. The first step is to download the *.pem key. The cool thing is the commands on this dialog are set up to use the *.pem key right from your downloads directory, so you can just download it to where it goes by default. (If you’ve changed your default download location, you’ll need to adjust the path accordingly.) With that downloaded we’re ready for the next steps.

Getting Connected via mariadb CLI

Here (in the above terminal image) I’ve executed the mariadb --host xyz.skysql.mariadb.com --port 3306 --user theUserHere -p --ssl-ca ~/Downloads/skysql_chain_2022.pem command. I then entered the password for my account; the command uses the passed *.pem file to make an SSL connection to the database, and I’m in.

The next command the connect dialog provides is to set the password for the database account to something that isn’t the default generated password.

Here I copied in my first attempt to change the password and noted there are complexity rules applied to it. Simple enough: I added some good characters (special and all), and that met the check.
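For reference, here’s a sketch of what that password change looks like in MariaDB. The password below is just a placeholder, and the exact statement SkySQL’s connect dialog provides may differ slightly:

```sql
-- Placeholder password; SkySQL applies complexity rules
-- (length, mixed case, digits, and special characters).
SET PASSWORD = PASSWORD('aN3w!Example-Passw0rd');
```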

Now with the security needs met I was ready to do some database things! First step, I wanted a database on my database server! But I was also curious: what databases already exist on the server, and could I view them with a SHOW DATABASES; command? Sure enough, information_schema, mysql, performance_schema, sky, and sys are there. These are the system databases that the server needs, and I presume the sky database is SkySQL related.

Now that I knew what databases exist, I wanted to get my own created, so I issued a minimal create database command.

CREATE DATABASE adrons_epic_db;

That completed successfully, so I followed it with another `SHOW DATABASES;`, this time with a `LIKE` clause to show just the database I created.

I presumed that next I’d need to switch to that database to issue commands against it. I followed that with a create table command to get something to put data into.

With a table created I could now finally store some data in the database. With a simple insert statement I was able to add a row to the table. Since I’d not set a default value for the id, I inserted it explicitly; normally I’d set the column to auto-increment and let the value be generated.
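Pulling the steps above together, here’s a sketch of the statements involved. The table name and columns are hypothetical stand-ins for what’s in the screenshots:

```sql
USE adrons_epic_db;

-- Hypothetical table; the actual columns in the screenshots may differ.
CREATE TABLE railroads (
    id INT NOT NULL,
    name VARCHAR(120) NOT NULL,
    PRIMARY KEY (id)
);

-- The id is inserted explicitly since no default or AUTO_INCREMENT was set.
INSERT INTO railroads (id, name)
VALUES (1, 'Union Pacific');

SELECT * FROM railroads;
```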

I followed that up with a select to see the data I just added. You can also see I pulled a dyslexic maneuver on the word “railraods”, which is a great reason to check out something with autocomplete, like DataGrip. Which just happens to be the next tool I wanted to connect!

One last action, however: closing out the connection with exit. It’s always good practice to close out of things properly so that connections aren’t left hanging.

Using JetBrains DataGrip

DataGrip is a great tool from JetBrains for working against a huge number of databases. With any new database I like to get DataGrip wired up as one of my primary tools. I’ll show you how I went about doing that with this new MariaDB SkySQL database now. Pay close attention, as there are specific peculiarities to getting this to work.

First step was simply starting a new project.

Next up the connection itself. From the long list of databases I found and selected MariaDB.

Next, on the connection settings, I went with the default name it generated after I pasted in the “Host” value. That value comes from SkySQL’s connection dialog; if you note, there are copy icons there that will copy the specific text to your clipboard. It’s a good idea to use those instead of trying to select the dialog content directly.

With the host set I also needed to download the driver, which I did using the download link (which in this image is replaced by the “Test Connection” link). With these settings made, I clicked on the SSH/SSL tab of the connection dialog. On that tab I checked “Use SSL” and used the select button to navigate, via the file browser, to the location of the certificate file. I then clicked “Test Connection” and got an error. It seemed I’d done everything correctly, but the Mode needed to be set to “Verify CA”. Once I did that, the connection worked.

Clicking on Server Objects and choosing a new workspace, I could then issue SQL commands, just as with the CLI before.

A quick `SHOW DATABASES` gave me a list of databases in the results below, and I went ahead and issued a `USE` command; as you can see, the autocomplete kicked off immediately! The glory and ease of development when you’ve got autocomplete!

I also mocked going through and creating a table, just because I wanted to snag a screenshot showing the details of how you can create a table with DataGrip. It’s sometimes a clunky interface, but it’s also very powerful. It’s easy to create tables and other elements of the database without remembering the esoteric nuances of the CREATE syntax, all while seeing the SQL in the preview below the form.

Just for good measure, here’s the dialog querying against that same table I created earlier with the CLI. With all that, a solid getting started tour, and I’m now set up to start development against my database. All that, and more, in subsequent posts.

So if you’re looking for a managed MariaDB option, MariaDB SkySQL is a very good one. It’s easy to get started with and easy to use, and (this makes me happy) it comes with decent security out of the gate! I’m looking forward to using it more in the future.

Shortlist of Database as a Service Providers

There are top Database as a Service providers for each of the major open source databases: MariaDB, PostgreSQL, MongoDB, Apache Cassandra, Redis, Elasticsearch, and Neo4j (graph database).

Top 10 GraphQL Anti-patterns IME “The Horror”

While GraphQL provides a flexible and powerful approach to building APIs, there are some common anti-patterns that developers may unintentionally implement when working with GraphQL query resolvers. These anti-patterns – the opposite of yesterday’s top 10 practices – can lead to issues such as performance bottlenecks, security vulnerabilities, or maintenance difficulties. Here are some of the top anti-patterns to avoid:

  1. N+1 Problem: The N+1 problem occurs when resolver functions trigger additional database queries within a loop or for each item in a list. This can result in a large number of database queries, leading to poor performance. Implement data batching techniques using tools like DataLoader to mitigate this issue; to learn more about DataLoader, check out this post.
  2. Over-fetching and Under-fetching: Over-fetching happens when a resolver fetches more data than the client actually needs, resulting in unnecessary data transfer and increased response size. On the other hand, under-fetching occurs when the resolver does not provide enough data to fulfill the client’s request, leading to additional round trips. Design your resolvers carefully to strike the right balance and only fetch the required data.
  3. Resolver Fatigue: Resolver fatigue refers to a scenario where a single GraphQL resolver is responsible for handling a large number of fields or complex logic. This can make the resolver codebase difficult to maintain, understand, and test. Break down your resolvers into smaller, more manageable units to avoid resolver fatigue.
  4. Deep Nesting: GraphQL allows for nested queries, but excessive nesting can lead to performance issues. Deeply nested queries may result in complex resolver logic and multiple database queries. Try to flatten your schema structure and optimize resolver logic to avoid unnecessary complexity.
  5. Lack of Caching: Not implementing caching mechanisms in your resolvers can result in repeated and costly data fetch operations. Introduce caching strategies, such as in-memory caching or distributed caches, to store frequently accessed data and reduce the load on your data sources.
  6. Inefficient Pagination: Pagination is commonly used in GraphQL to handle large datasets. Implementing pagination incorrectly can lead to performance issues and inefficient querying. Use appropriate pagination techniques, like cursor-based pagination, to efficiently retrieve and display data. To read more details on pagination and how it can be applied to GraphQL queries check out this post.
  7. No Rate Limiting: Without proper rate limiting mechanisms, your GraphQL API may be susceptible to abuse and DoS attacks. Implement rate limiting at the resolver or API level to control the number of requests and protect your server resources.
  8. Lack of Input Validation: Failing to validate and sanitize user input can lead to security vulnerabilities, such as SQL injection or unauthorized data access. Validate and sanitize input parameters in your resolvers to prevent these risks.
  9. Monolithic Resolvers: Creating monolithic resolvers that handle multiple unrelated responsibilities can lead to code duplication, reduced reusability, and increased maintenance effort. Follow the single responsibility principle and modularize your resolvers to improve code organization and maintainability.
  10. Insufficient Error Handling: Inadequate error handling in resolvers can result in unhandled exceptions or unclear error messages returned to the client. Implement comprehensive error handling and provide informative error messages to assist client developers in troubleshooting and debugging. For more details on error handling, check out this post.
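The first anti-pattern is worth a tiny illustration. Here’s a toy Python sketch of the batching idea behind DataLoader; the loader class and data source are hypothetical stand-ins for illustration, not a real GraphQL or DataLoader API:

```python
class BatchLoader:
    """Collects requested keys, then resolves them with one batched call."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn  # fetches many keys in a single query
        self.queue = []           # keys requested since the last dispatch

    def load(self, key):
        self.queue.append(key)

    def dispatch(self):
        # One batched call instead of one query per key (the N+1 problem).
        keys = list(dict.fromkeys(self.queue))  # dedupe, preserve order
        self.queue.clear()
        return self.batch_fn(keys)

# Hypothetical data source: one call fetches every requested author at once.
AUTHORS = {1: "Ada", 2: "Grace", 3: "Edsger"}

def fetch_authors(ids):
    return {i: AUTHORS[i] for i in ids}

loader = BatchLoader(fetch_authors)
for author_id in [1, 2, 1, 3]:  # resolvers request a key per list item...
    loader.load(author_id)
result = loader.dispatch()      # ...but only one batched query is issued
```

A real DataLoader also caches per request and dispatches automatically on the next tick of the event loop; this sketch just shows the queue-then-batch shape.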

By avoiding these anti-patterns and following established best practices, you can enhance the performance, security, and maintainability of your GraphQL query resolvers.
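To make one more of these concrete, here’s a toy Python sketch of validating input at the resolver boundary (anti-pattern 8). The field names and rules are hypothetical, and it isn’t tied to any particular GraphQL library:

```python
import re

# Hypothetical rule: usernames are 3-30 chars of letters, digits, underscore.
USERNAME_RE = re.compile(r"^[A-Za-z0-9_]{3,30}$")

def validate_user_args(args):
    """Return a list of validation errors; an empty list means the input is safe."""
    errors = []
    username = args.get("username", "")
    if not USERNAME_RE.fullmatch(username):
        errors.append("username must be 3-30 chars: letters, digits, underscore")
    limit = args.get("limit", 10)
    if not isinstance(limit, int) or not (1 <= limit <= 100):
        errors.append("limit must be an integer between 1 and 100")
    return errors

# A resolver would reject the request instead of interpolating raw input
# into a query string, which is how injection bugs happen.
assert validate_user_args({"username": "adron", "limit": 25}) == []
assert validate_user_args({"username": "x; DROP TABLE users;--"}) != []
```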

10 Best Practices IMHO for GraphQL

Here are 10 best practices for GraphQL accrued from dozens of GraphQL API implementations:

  1. Keep your schema simple: Design your GraphQL schema with a clear and concise structure. Avoid unnecessary complexity and keep it focused on the specific requirements of your application. One great idea is to implement consistent standards to keep your schema simple, read more about those ideas here.
  2. Think about the client’s needs: GraphQL allows clients to specify their data requirements precisely. Collaborate with the client-side developers to understand their needs and design your schema accordingly, minimizing over-fetching or under-fetching of data.
  3. Version your schema: As your application evolves, consider versioning your GraphQL schema to ensure backward compatibility. This allows you to introduce changes without breaking existing client implementations.
  4. Use precise field names: Choose field names that accurately describe the data they represent. Be consistent with your naming conventions and avoid ambiguity to enhance the readability and maintainability of your schema.
  5. Avoid excessive nesting: While GraphQL supports nested queries, avoid deep nesting of fields as it can lead to performance issues and over-fetching of data. Optimize your schema by flattening nested fields when possible.
  6. Implement proper authentication and authorization: GraphQL does not enforce any specific authentication or authorization mechanisms. It is crucial to implement appropriate security measures to protect your GraphQL API endpoints, such as using authentication tokens, access control rules, and rate limiting.
  7. Implement pagination for large datasets: When dealing with large datasets, use pagination techniques (e.g., cursor-based pagination) to efficiently fetch and display data. This helps in improving performance and reduces the load on both the server and the client. For details on paging patterns and implementation details check out this article.
  8. Utilize data loaders: GraphQL data loaders help optimize data fetching by batching and caching requests. Implement data loaders to avoid the N+1 problem, where multiple database queries are triggered for each item in a list. Check out this post for more details.
  9. Document your schema: Provide comprehensive documentation for your GraphQL schema to assist client developers in understanding the available types, fields, and their usage. Clear documentation and standards promote developer adoption and simplify integration. Check out this post for more details on GraphQL standards.
  10. Monitor and optimize performance: Regularly monitor and analyze the performance of your GraphQL API. Identify and optimize slow-performing queries, implement caching strategies, and leverage tools like Apollo Engine or persisted queries to improve overall performance.
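Practice 7 is easiest to see in code. Here’s a toy in-memory Python sketch of cursor-based pagination; the row data and the connection/edge shapes are assumed for illustration, not taken from any specific server library:

```python
import base64

# Sample in-memory data standing in for a table ordered by id.
ROWS = [{"id": i, "title": f"post {i}"} for i in range(1, 8)]

def encode_cursor(row_id):
    # Opaque cursor: clients shouldn't parse or construct these themselves.
    return base64.b64encode(str(row_id).encode()).decode()

def decode_cursor(cursor):
    return int(base64.b64decode(cursor).decode())

def page(first, after=None):
    # The cursor records the last id seen, so each page resumes exactly
    # where the previous one stopped.
    start_id = decode_cursor(after) if after else 0
    matching = [row for row in ROWS if row["id"] > start_id]
    window = matching[:first]
    return {
        "edges": [{"node": row, "cursor": encode_cursor(row["id"])} for row in window],
        "pageInfo": {
            "hasNextPage": len(matching) > first,
            "endCursor": encode_cursor(window[-1]["id"]) if window else None,
        },
    }

first_page = page(first=3)
second_page = page(first=3, after=first_page["pageInfo"]["endCursor"])
```

Against a real database the `matching` filter would become a `WHERE id > :cursor ORDER BY id LIMIT :first` query, which avoids the drifting results that offset-based paging suffers from when rows are inserted or deleted between pages.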

Remember that these practices may vary depending on the specific requirements and context of your GraphQL implementation. It’s always recommended to stay updated with the latest best practices and community guidelines.

Engineering API Governance

I’ve written this post to provide a detailed look at what a role around API Governance, or simply the functions of API Governance, would be centered around.

Why does this role exist?

An API Governance function or role would be responsible for overseeing the governance of APIs within an organization. It would center around ensuring that APIs are developed, implemented, and maintained according to established standards, best practices, and business requirements.

Here are the primary reasons why this type of work and role exist:

  1. Growing Importance of APIs: APIs have become the backbone of modern software development and integration. They enable different applications and systems to communicate and share data seamlessly. As organizations increasingly adopt microservices architecture and cloud-based solutions, the number and complexity of APIs they use grow significantly.
  2. Consistency and Standardization: In large organizations with multiple development teams and projects, maintaining consistency in API design, implementation, and usage becomes challenging. An API governance role ensures that APIs adhere to standardized guidelines, leading to better interoperability and reusability.
  3. Security and Compliance: APIs expose access points to an organization’s systems and data. Ensuring the security of these access points is critical to prevent data breaches, unauthorized access, and other security risks. An API governance role focuses on implementing robust security measures and compliance with relevant regulations.
  4. Efficient Collaboration: When different teams create APIs independently, they may not be aware of existing solutions or may duplicate efforts. An API governance function facilitates collaboration, knowledge sharing, and reuse of APIs, leading to more efficient development processes.
  5. Scalability and Performance: Proper governance includes performance monitoring and optimization of APIs. This helps identify bottlenecks, improve response times, and ensure that APIs can handle increasing workloads as the organization grows.
  6. User Experience: APIs are used by developers, both internally and externally, to build applications. A well-governed API ecosystem includes clear and comprehensive documentation, which enhances the developer experience and enables faster integration with APIs.
  7. Risk Mitigation: By setting up a structured governance process, organizations can identify and address potential risks associated with APIs early on, reducing the chances of costly issues arising in production.
  8. API Lifecycle Management: APIs have a lifecycle that includes planning, development, versioning, and deprecation. Proper governance ensures that APIs are maintained and updated according to their lifecycle stages, preventing the accumulation of outdated or obsolete APIs.
  9. Compliance with Business Strategy: API governance aligns API development and management with the organization’s overall business strategy, ensuring that APIs support the company’s goals and objectives effectively.
  10. Adoption of Best Practices: An API governance role stays updated with industry trends and best practices, continuously improving the organization’s API strategy and development processes.

Engineering API Governance Primary Functions

The API Governance function or role is responsible for overseeing and managing the governance of Application Programming Interfaces (APIs) within an organization. It ensures that APIs are developed, implemented, and maintained according to established standards, best practices, and business requirements. Here are some key aspects of an API Governance function or role:

  1. Defining API Standards: The API Governance team establishes and documents API design standards, naming conventions, versioning practices, security guidelines, documentation requirements, and other relevant policies. These standards promote consistency and facilitate seamless integration among different APIs.
  2. API Lifecycle Management: The API Governance function oversees the entire lifecycle of APIs. This includes planning and design, development, testing, deployment, monitoring, version control, and eventual retirement or deprecation when necessary.
  3. Security and Compliance: Ensuring the security of APIs is a critical aspect of the API Governance role. The team establishes security protocols, access controls, authentication mechanisms, and data protection measures to safeguard APIs and the systems they interact with. Additionally, they ensure compliance with relevant industry regulations and data privacy laws.
  4. Documentation and Communication: API Governance teams create comprehensive and easily accessible documentation for APIs. This documentation helps internal and external developers understand how to use the APIs effectively and provides details about their capabilities, limitations, and potential use cases.
  5. Monitoring and Performance: The API Governance function sets up monitoring systems to track API usage, performance metrics, and error rates. This data helps identify potential issues, bottlenecks, and areas for improvement. It allows the team to optimize APIs to meet service-level requirements and user expectations.
  6. Collaboration and Coordination: The API Governance role involves collaborating with various teams, including development, product management, security, and operations. Effective coordination ensures that APIs are aligned with business objectives and developed in a way that meets the needs of different stakeholders.
  7. Review and Approval Process: The API Governance team establishes a review and approval process for new APIs and changes to existing ones. This process ensures that APIs meet the required standards, security measures, and compliance before they are deployed.
  8. Education and Training: The API Governance function may conduct training sessions for developers and other stakeholders to promote best practices in API development, usage, and maintenance.
  9. Adoption of API Management Tools: API Governance teams often utilize API management tools to facilitate API governance tasks. These tools can assist in monitoring, versioning, security, analytics, and documentation management.
  10. Continuous Improvement: The API Governance function remains updated with industry trends and best practices to continuously improve the organization’s API strategy and governance approach.

In a large enough organization, even these individual functions become individually staffed work. More often, however, a single API Governance staff member would be tasked with all of these items and would need to delegate and prioritize the various functions throughout the organization.

In following posts (which I’ll link here once they’re published) I’ll write up some scenarios, from working as an individual contributor to leading (i.e. managing staff) and hiring for API Governance roles. I’ll get into the nitty-gritty of how process, practice, and patterns can be used to elaborate on things like education, adoption of API management tools, continuous improvement, and the many other functions.

With that, subscribe for updates, and you’ll get any new posts direct to your inbox! (Check the sidebar to subscribe.)