Author: Emily Lahren

Do DML Statements Work in Liquibase Changesets?

After finishing last week's blog post about how to work with Liquibase, I decided to find the answer to one of the outstanding questions I had about the tool: whether it allows you to put DML statements in your changelogs and changesets. I couldn't find any documentation online about putting DML statements in changesets, so I had to figure it out myself. Finding the answer was much easier than I expected, since all it involved was adding a DML statement to a changelog, running the Liquibase update command, and seeing what happened.

So do DMLs work in Liquibase changesets?

Yes, they do. To prove it, I opened the existing changelog file that I created for last week’s tutorial and I added a new changeset.

Screenshot of text editor containing Liquibase changeset with DML statement to insert into a table
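
In case that screenshot is hard to read, here is a rough sketch of the kind of changeset I added; the author:ID value and the table and column names are made up for illustration, and only the changeset comment format comes from Liquibase's formatted SQL syntax:

-- changeset elahren:1.3
INSERT INTO account_types (account_type_id, account_type_name)
VALUES (1, 'Checking');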

Then I opened the command prompt for Liquibase and ran the normal update statement to get my database aligned with the changelog file.

Screenshot of Liquibase command window showing successful execution of DML changeset

The update statement completed successfully, which I truly was not expecting. Then I had to go into the database to see if that DML statement was actually executed on the DB or not, and it was!

Screenshot of PGAdmin window showing SELECT statement results containing record inserted by DML changeset

I was very excited to see that, because it meant that if my team decided to switch to this tool, we could continue deploying DML scripts alongside any DDL scripts they may be associated with.

Summary

Today’s post is short and sweet. I wanted to see if the Liquibase tool had a key feature I was looking for it to have but couldn’t find documentation about. I was thrilled to see that it does work with DMLs. Such a small but important feature.

How to Set Up and Use Liquibase, Part 2

In last week’s post, I covered the initial setup steps you must follow when starting to work with Liquibase. In this week’s post, I will be finishing up my tutorial of getting started with Liquibase. If you haven’t yet downloaded and set up Liquibase on your computer, please review that post before reading this one.

What's in this post:

  • Create the baseline changelog file for your database
  • Adding and tracking ongoing database changes
    • Method 1: Adding scripts to changelog file
    • Method 2: Making changes in your IDE
  • Structuring your changelogs according to best practices
  • Outstanding questions
  • Summary

Create the baseline changelog file for your database

Using the "generate-changelog" command with the Liquibase CLI, we can create a SQL file containing queries that will regenerate all objects in your database. Which database objects get scripted into this file depends on which license you have for Liquibase. If you have the open-source version of the tool, it will script out all non-programmable objects like tables, keys, etc. If you want or need to script out all of your programmable objects, such as procedures and functions (plus other items), you will need the Pro version of the tool.

Either way, the command for creating the script is exactly the same.

liquibase generate-changelog --changelog-file=mydatabase-changelog.sql --overwrite-output-file=true

Let's break this command down. The first two words are simple: you're calling Liquibase and telling it to run the generate-changelog command. Next is the "changelog-file" argument, which lets you specify the file the new changelog should be written to. The next argument, "overwrite-output-file", tells the tool whether to overwrite that specified file if it already exists. In this case, I specified true for that argument because I want the tool to overwrite the example changesets in the file it created upon project creation with the actual queries for my database. After running this command, you should get a success message like the following.

And if you open that specified file now, it should contain the actual scripts to generate all of the objects in your database, each change separated into its own changeset. Each generated changeset will be defined with the username of the person who generated the file, as well as the tracking/version number for the set.
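
For reference, a SQL-formatted changelog generated this way starts with a special comment header and then one changeset per object. Here is a rough sketch of the shape; the object names and the exact ID values are illustrative, not copied from my file:

-- liquibase formatted sql

-- changeset elahren:1731000000000-1
CREATE TABLE accounts (
    account_id integer NOT NULL,
    account_name character varying(100),
    CONSTRAINT accounts_pkey PRIMARY KEY (account_id)
);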

Now that you have baselined your project, you are ready to start normal development and make ongoing changes to your database.

Adding and tracking ongoing database changes

There are two methods for adding/tracking database changes with this tool: 1) add your scripts to the changelog file as changesets, then “update” the database with those changes, or 2) make your changes within the IDE for your database (ex: PGAdmin) then use the “generate-changelog” command to identify and script those changes.

Method 1: Adding Scripts to Changelog File

Open your changelog file and add a new line. On that line, add the required comment that lets Liquibase know you are starting a new changeset. This line looks like "-- changeset author:versionNumber", for example "-- changeset elahren:1.1". Then, on a line below your changeset comment, add the DDL script you would like to run on your database. Once you have added all the changes you would like to your changelog file, save and close the file, then open the Liquibase command prompt to execute those changes on your database.
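
As a concrete sketch of what Method 1 looks like (the table and column here are hypothetical; the comment format is the one described above):

-- changeset elahren:1.2
ALTER TABLE accounts ADD COLUMN account_status character varying(20);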

If you would like to preview the changes Liquibase will run on your database, you can run the command "liquibase update-sql", which will show you all the SQL that will be executed, including your queries as well as the queries Liquibase runs to track what you're applying. In the below screenshot, the commands with a green square are the ones I included in my changesets, and the commands with a blue square are the ones that Liquibase will run to track the changes.

If the preview looks correct, you can then run the command "liquibase update", which will apply all the previously viewed SQL queries to your database. You can verify the changes were successfully applied by opening your database in your normal IDE (e.g. PGAdmin) and confirming the changes.

Method 2: Make Changes in your IDE

The process for making the changes in your IDE and then tracking those changes in Liquibase is almost exactly the same as the process we used to create the initial changelog file when setting up the project. It is as easy as making whatever database changes you want in your IDE, then opening the Liquibase CLI and running the "generate-changelog" command, either with a new file name if you want to put the changes in a new changelog file, or with the same file name plus the "--overwrite-output-file=true" argument.

If you are going to use the first option, writing to a new changelog file, it seems like you will then need to edit the file after creating it to remove any of the queries you didn’t create in your latest changes (since the command will try to recreate all objects in your database).

I'm not sure if this is the recommended workflow for tracking database changes, but it was a feature my team was hoping to get from the database change-tracking tools we've been investigating, so I found a way to make it happen with Liquibase. If you want or need a "database-first" approach to change tracking (making changes directly to the database and then generating files to track them), instead of a "migration-first" approach (writing migration/change scripts and then applying them to your database), that appears to be technically possible with this tool.

Structuring your changelogs according to best practices

You can set up and structure your changelogs in any way you would like (it's your project, after all), but Liquibase does have some ideas to help you stay organized. There are two different organization methods they recommend: object-oriented and release-oriented.

Object-oriented means you will create a different changelog file for each object or type of object being tracked in your database (e.g. one file for stored procedure changes, one file for table changes, etc.). I personally don’t like the idea of this organization method since it would mean you could be updating many files each time you make database changes, like if you’re updating procedures, tables, indexes, and views all for one release. However, having all the object types separated could also be a benefit, depending on how you normally complete your work.

Release-oriented means you make a new file for each release of your software/database. This method seems more familiar to me personally since it's similar to the concept of migration scripts in Red-Gate's SQL Change Automation or Flyway tools, where you can combine multiple database changes into a single file and deploy them all at once. This process could also work for organizations that use a more structured delivery system rather than continuous delivery/agile development. That way you could put all of your changes for the week, month, or whatever development length into one file to associate with one particular release.

Whichever method you choose should work well if you set it up properly; the decision only depends on how you work and how you prefer to track that work.

Outstanding Questions

The first outstanding question I have about this tool right now is whether you can put DML scripts in your changelogs. That is something supported by Red-Gate's SQL Change Automation and Flyway tools, which is what I'm used to. So far, I haven't been able to figure out if that's possible with Liquibase. Being able to deploy DML changes alongside a regular deployment really simplifies handling the DML scripts you may need to run in your production environment, because it makes sure they go out with the deployment they are related to. An example of this is if you are adding a lookup-type table (e.g. AccountTypes) and need to add a few records to that table after it's created. Normally, you would need to run such a DML script manually after your deployment has completed. But SCA and Flyway allow you to put the DML in a deployable script that will automatically insert that data after the table is created. That of course can come with its own challenges, but it's something I've really enjoyed with Red-Gate SQL Change Automation, so I want it to be possible with Liquibase.

The second outstanding question I have about Liquibase is whether it can work with a secrets manager for database user passwords. The way I set up my test project locally required the password for the database user Liquibase connects with to be saved in cleartext in the properties file, which is not safe. For my purposes, it was fine since it's a dummy database that doesn't have any real data in it. But for production purposes, there is no way we would ever save a database user password in cleartext in a file. I haven't had the chance to research this question more yet, so I'm not sure whether the tool can work with a secrets manager or not.

Summary

When I first started working with Liquibase I was pretty frustrated with it since it was a totally new-to-me tool with documentation I didn’t find intuitive. But I kept working with it because I wanted to make it work for my organization and then just found it interesting to learn more about. I now feel fairly confident working with the tool in the ways my organization would need.

For being a tool with a completely free-to-use version, it seems to have a good number of features that many developers might need and could use for tracking and deploying changes to their databases. I can't honestly say that I would prefer this tool to Red-Gate's SQL Change Automation or Flyway tools, which I currently work with, since they have a better user interface and more intuitive script creation processes. But Liquibase does seem like a useful tool that I could get used to working with given enough time. It's worth trying if your organization is working with a limited tool budget.

How To Set Up and Use Liquibase, Part 1

In a recent post, I gave an overview of what Liquibase is and what features it offers as a bare minimum. This week and next, I am going to give the step-by-step instructions I followed myself to learn how to properly set up a Liquibase project after playing with it for several hours. I personally found the documentation offered by Liquibase a little confusing, so this post is essentially the notes I took while figuring out what I really needed to do to set up a demo project with the tool. The aim of this post is not to be an exhaustive tutorial of the software, since I am far from a Liquibase expert. Let me know in the comments if you found any of this useful or interesting!

What's in this post:

  • Download Liquibase and install it on your computer
  • Verify that Liquibase is properly installed on your computer
  • Create your first Liquibase project
  • New project files
  • Database changes made with project initialization
  • A quick note about changelogs and changesets
  • Summary

Download Liquibase and install it on your computer

I won’t provide a link to it here (because that would be sketchy), but you can find the tar or Windows installer download for the free (open-source) version of Liquibase on the Liquibase website under Editions & Pricing > Open Source. I used the Windows installer version since I am working on a Windows machine. After you download the installer, you can run it to install the tool. I did not change any of the setup options (there were very few). The installation and setup were both extremely fast.

After the installer runs, you should be able to see where the tool was installed, which for me was under Program Files. Now you can start working from that installed folder, or you can copy the entire folder to another directory so you can play around with it without the fear of breaking something and having to reinstall everything to fix issues. I made a copy in another location and worked from that copy (which was suggested by one of the Liquibase tutorials).

Verify that Liquibase is properly installed on your computer

The next important step is to verify the status of the tool to make sure it installed correctly. To do this, open a command prompt window, navigate to the directory of the Liquibase folder you want to work with, and run the command "liquibase status".

As you can see in the above screenshot, although the "status" command returned an error (since I hadn't set up a Liquibase project yet), the tool did run and work (as evidenced by the giant Liquibase printout). Download and installation of the tool were successful.

Create your first Liquibase project

Creating your first project is simpler than I originally thought. I knew that, at a bare minimum, a file called "liquibase.properties" needed to be created, but I thought I had to do that manually (mostly because I skipped a page or two of the documentation that I thought weren't needed). Although it is possible to create that file manually and then enter the necessary values by hand, the easiest way to set up a new project and all of its necessary files is to use the "init" command: "liquibase init project".

That command will run you through the process of setting up a new project, including setting the necessary values for the liquibase.properties file. If you are fine with using all default settings for your project, specify “Y” when prompted, otherwise specify “C” which will allow you to customize the values used by the project so you can work with your own database.

What I specified for each of the setup prompts:

  • Use default project settings or custom? Custom, "C"
  • Relative path for the new project directory? In the main Liquibase folder, not in a subfolder: "./" (I would likely change this for an actual development project if I were creating one for work)
  • Name for the base changelog file? "mydatabase-changelog", but you can set it to whatever makes the most sense to you. I definitely wouldn't use that format for an actual work project if we move forward with this.
  • Changelog format? "sql", since I am working with a normal SQL database
  • Name of default properties file? "liquibase.properties", the recommended name, but you could change it to something else
  • JDBC URL to the project database? "jdbc:postgresql://localhost:5432/postgres". This points to the database called "postgres" on my local machine, which is a PostgreSQL database, so it uses that type of connection string. Other database engines will have their own connection string formats.
  • Username to connect to the database? "postgres", the default user for a new PostgreSQL instance, which is what the project should use to interact with the database. You should change this to a better-defined user created just for the Liquibase tool.
  • Password? The password associated with the specified user. For real-world purposes, I hope we'll be able to use a secrets manager tool to update that file for deployment purposes so that we don't have to specify the password in cleartext in the file.

New Project Files

After you go through all of the above steps, Liquibase will have created several new files for you. I am not sure what most of them are for at this point, but I think some of the other files created besides the properties file are used for CI/CD integration (flow files). The properties file I created for my project now looks like this after my project setup:

If you have a license key for the enterprise/Pro version of the software, you can scroll to the bottom of this file and uncomment the line “liquibase.licenseKey:” then add your license key after the colon on that line.

You can also review the default changelog file created by Liquibase if you specified that during project initialization. For me, the file contains some sample changesets that don’t relate to my actual database:

These are the tables in my database for the Liquibase project:

To set up this file to represent your database, we’ll need to use the “generate-changelog” command on the project, and I’ll walk through that in the next post.

Database changes made with project initialization

As you can see from the last screenshot above showing the tables in my database, there are two tables called databasechangelog and databasechangeloglock. Those two tables are created in your project database when you connect Liquibase to the DB, because that is how the tool keeps track of which changelogs and changesets have already been applied to the database. The databasechangeloglock table is what Liquibase uses to make sure only one instance of the tool is applying changes to the database at a time.
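
If you're curious what Liquibase records, you can query the tracking table directly. A quick sketch (these column names come from the standard databasechangelog schema):

SELECT id, author, filename, dateexecuted, orderexecuted
FROM databasechangelog
ORDER BY orderexecuted;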

You are now ready to start making changes to your database and tracking those changes with Liquibase!

A quick note about changelogs and changesets

The changelog is the main file (or files) that contain the queries you use to interact with your project database. You can have just one or you can have a series of changelogs, each used for a different part of your database.

Each changelog file contains what Liquibase calls changesets, where each changeset is a single unit of change for your database, like an ALTER TABLE statement or any other DDL statement you can run against a database. A changeset is identified in the changelog file with a comment line containing the change's author and an ID, which can essentially be any value you would like to use to track the changes. If you use the above steps to have Liquibase create your first changelog file for you, it will generate an ID for each changeset in the file. The changesets are executed in the order they appear in the changelog, and the author and ID are how Liquibase tracks which changesets have already been applied.

Summary

In this post, I covered how to set up your first Liquibase project and what each of the related commands and files means. This tool is surprisingly simple to work with, as can be seen from the initial setup process. Next week, I will cover what to do now that you have the tool successfully installed and set up on your computer.

What’s in the next post:

  • Create the baseline changelog file for your database
  • Adding and tracking ongoing database changes
    • Method 1: Adding scripts to changelog file
    • Method 2: Making changes in your IDE
  • Structuring your changelogs according to best practices
  • Outstanding questions
  • Summary

More Postgres vs. SQL Server

Welcome to a coffee break post where I quickly write up something on my mind that can be written and read in less time than a coffee break takes.

As I’m getting further into my PostgreSQL adoption project at work, I am continuing to learn about so many more small differences between Postgres and SQL Server that I never would have expected. None of the differences I’ve found so far are profound, but they will pose challenges for all the developers we have (including myself) that have only worked with SQL Server.

Postgres does not have default constraints

That's right: there is no way to make a default constraint on a Postgres table; instead you define a default value, which cannot have a name assigned to it. In SQL Server, you can define a default constraint in essentially the same way as you would define a unique or key constraint, and you can give that constraint a name. In Postgres, you simply specify that a column has a default value when adding that column to a table. In theory (I haven't tested it yet), the generation of default values works exactly the same between the two engines; one just isn't saved as its own named object.
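
To illustrate the difference, here is a minimal sketch with hypothetical table and column names:

-- SQL Server: a default is a named constraint
ALTER TABLE Accounts
ADD CONSTRAINT DF_Accounts_CreatedDate DEFAULT (GETDATE()) FOR CreatedDate;

-- Postgres: a default is just a property of the column, with no name of its own
ALTER TABLE accounts
ALTER COLUMN created_date SET DEFAULT now();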

The index differences are amazing (and confusing)

I'm not going to lie, the prospect of figuring out when to use which kind of index in Postgres is daunting to me. Deciding which type of index to use with SQL Server is very straightforward, since you only need to know whether the table already has a primary key/clustered index and then go from there. You pretty much only have clustered or non-clustered indexes in SQL Server. But Postgres has a handful of different options, and you need to know how the data will be queried in order to pick the best one. The default/usual best option seems to be a B-Tree index, which is comforting and familiar, but one day I know I'll likely have to figure out the other options as well.
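
As a small example of what that looks like in practice (the table and column names are made up, and the second statement assumes a jsonb column):

-- B-Tree is the default when no index type is specified
CREATE INDEX idx_accounts_name ON accounts (account_name);

-- other access methods are chosen with USING, e.g. GIN for jsonb or array columns
CREATE INDEX idx_accounts_tags ON accounts USING gin (tags);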

The default length of object names is really short

Of these three new items I learned recently, I think this one is going to be the most annoying. The maximum name length for objects in Postgres is surprisingly low, only 63 bytes by default, while the max length for object names in SQL Server is 128 characters. I'm mostly concerned that we're going to need to figure out how to rename many long object names as we migrate off SQL Server, and that could be tedious. But we'll see how that goes. I read somewhere in my research that you may be able to change this default length (it appears to require recompiling Postgres with a larger NAMEDATALEN value), so that will need to be something else for me to research.
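
If you want to see the limit in action, a throwaway example like this will show Postgres truncating the identifier to 63 bytes and raising a notice about it:

CREATE TABLE this_is_a_really_long_table_name_that_keeps_going_and_going_until_it_passes_the_limit (id integer);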

What is Liquibase?

Liquibase is a tool used to track and deploy changes to databases. It can connect with Git and other software, including CI/CD software, allowing your team to collaborate and track changes to your databases. There are several different software options similar to (and even better than) Liquibase, but what makes this tool unique is that it has both an open source version, which could benefit smaller organizations that don’t have as much cash to spend on database software, and a “Pro” version, which gives more features for those who need it and can pay the additional cost.

I’m currently reviewing this software and playing with it for a project at work, and I’ve found it a bit challenging to figure things out from the Liquibase websites and documentation. I’m summarizing my findings here not only for others who may read this in the future but also to help myself get everything organized and clarified in my mind. This will be a two-part series of posts, with the second providing a deep dive into how to use Liquibase with your database.

What’s in this post:

  • Why would you want to use this tool?
  • Features of open-source edition
  • Features of Pro edition
  • How to get 30-day free trial of Pro edition
  • Other options for similar tools

Why would you want to use this tool?

You want to be able to track and manage changes happening in your database

In the modern development process, all code, including database code, should be tracked using source control. Using source control enables you to have a complete history of what changed when, and who made that change. Having this level of tracking can be great for overall documentation of what you have in your systems, but it can also help immensely in troubleshooting bugs and other issues in your code to find when an issue started happening. It also enables you to quickly revert any bad or unintended changes without having to update all the code again manually. This is a well-known and well-used system in the software development realm but doesn’t seem to be quite as popular in the database realm, at least from my own experience. Liquibase can help organizations start on this journey.

You want to automatically deploy database changes to multiple environments

When you first start using Liquibase, you can generate a changelog file that acts as a baseline for recreating the database, with the file containing all of the SQL queries necessary to recreate all objects within the database. Then, as you make more changes, those will be added to one or more files that can be tracked with Git or other source control to help with deployments and change tracking of your system. If you are looking for a way to automatically deploy your database changes through pipelines, Liquibase should be able to help with that, although I haven’t gotten that far in my project so I can’t say for sure. But based on how the tool is set up, you should be able to use the software on a build and deployment server and have it execute the changelog scripts you create on databases in other environments. You can read more about the integration with CI/CD systems on the Liquibase website.

You need a database change-tracking tool that works with multiple database engines

On the Liquibase website, they claim they can work with over 50 database engines, which is a huge claim to make and an accomplishment to have. There are very few other options for this type of tool that can claim to support more than just SQL Server, let alone 49 other database options. I am looking at this tool exactly for this reason because I need to be able to manage database changes on at least SQL Server and PostgreSQL going forward, and perhaps others if needed since it supports so many different options. At this point in my current tool exploration project, I have only played around with the tool and how it works with Postgres since that’s the future my department is looking at, but I’m sure it would be just as simple to set it up to work with a SQL Server database as it has been with Postgres.

Features of the Open-Source Option

I’m not going to go through and list every single one of the features of the open-source version of Liquibase because you can see that on their website. Instead, I will discuss the features I am interested in and why I think it’s interesting that those features are available completely for free.

Run Preconditions Before Executing SQL Changes

The concept of preconditions interests me because it almost combines a custom system my department uses to validate data with the change tracking and deployment of changes. Although I haven't had a chance to use preconditions in my testing yet, it looks like a cool feature because it allows you to validate data using a SQL statement before the next change query is executed. For example, if you are adding an ALTER TABLE statement to your changelog file to drop or change a column, you can first add a precondition that checks the data in that column so the change doesn't run if, say, the column still contains data. With the precondition for SQL files, you specify an expectedResult value and then the query you want it to run; if the result of the query doesn't match the expectedResult value, execution of the file fails and the changes won't get deployed. This would be very useful for ensuring you're only running code in the scenarios you want it to run in.
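
Here is a rough sketch of what that looks like in a SQL-formatted changelog, based on the precondition syntax in the Liquibase documentation (the changeset ID, table, and column are hypothetical):

--changeset elahren:2.1
--preconditions onFail:HALT onError:HALT
--precondition-sql-check expectedResult:0 SELECT COUNT(*) FROM accounts WHERE legacy_code IS NOT NULL
ALTER TABLE accounts DROP COLUMN legacy_code;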

Preview SQL Changes Before Running Them

I've already used this feature in my testing to see what exactly Liquibase is doing under the hood when I run certain commands and apply changes, and it seems useful to me. Like a lot of database developers, I am paranoid about knowing exactly what I am about to execute on a database. Liquibase provides that capability with the "update-sql" command, which prints out in the CLI all of the SQL queries that will be executed when you apply the existing changelog and changesets to your database, including the ones the tool runs in the background to track changes in its two log tables.

Automatically Detect & Script Change for Non-programmable Objects

When I was first working with Liquibase for my proof-of-concept project, I thought that the command to automatically generate changelogs based on the current state of the database was only available in the Pro version. But I finally figured out that my assumption was incorrect: you can use the "generate-changelog" command even with the open-source version, but it will only generate queries for what I call non-programmable objects (everything but functions, procedures, etc.). If you are working with the open-source version of Liquibase and use the "generate-changelog" command, it will generate a new changelog (or overwrite an existing one if you specify an argument) containing SQL queries to recreate every object in your database that isn't one of the programmable-type objects.

I was happy to find this command in the Liquibase arsenal because it matches a working method my department uses regularly with our current database change management and source control tool. The normal workflow for a developer making a database change is to go into the database, make the change using SSMS, and then use our existing software to automatically detect the change and generate a migration script to track and implement that change upon deployment. This was a workflow I was hoping not to lose as we change tools, and it looks like we won't lose it if we switch to Liquibase. In my next post, I'll cover the two different methods of creating and tracking changes with Liquibase, each of which takes a different angle on where the change is made.

Features of the Pro Edition

I have been working with a 30-day free trial of the Pro edition of Liquibase to see what features it offers in that edition that we might need. If you would like to see a full list of the upgrades you can get with Liquibase Pro, visit their website to compare it against the open-source version.

What I found while working with the tool is that you do not need the Pro edition to use the command “generate-changelog” like I originally thought. If you would like to be able to automatically script out programmable objects like stored procedures and functions, you will need the Pro version for that. However, if you only want this tool to track changes to tables, foreign keys, primary keys, and other constraints (see the documentation for the entire list), then you would be fine to use the open-source edition of Liquibase.

Another big feature of the Pro edition that most people are probably interested in is that it is the only version of the software that you can integrate with CI/CD tools to do automatic deployments of the code you’re tracking with the tool. This feature would be a big one for my company since I doubt we are okay with going back to manual deployments of database code across dozens of databases in multiple environments.

The final reason you might choose the Pro edition over the open-source edition is that you can get a much greater level of support from the company with the Pro version (which makes sense). Pro comes with what they define as “standard” support which is through email. If you would like more advanced and involved support, such as 24-hour emergency help, you can add that to your subscription for an additional cost.

How to get a 30-Day Free Trial of Pro

If you would like to try out the Pro version of Liquibase to see if it meets your needs, they do offer a 30-day free trial through the website. You will need to go through the process of giving your email and other contact information, as well as setting up an appointment with one of their representatives to get the license key. In my experience, since my company is seriously considering this tool as an option and I can’t do the contact or negotiation for licenses (someone else handles that), I went through the meeting setup process but then contacted the representative and told them I would be unable to make the appointment, and still received the trial license through an autogenerated email. As of writing this, I still haven’t been bombarded by sales emails or calls from the company which I appreciate. It’s nice to be able to try out the full version of the tool without being harassed for it.

Other options for similar tools

While there are other, very expensive options for database change management and tracking software, Liquibase seems to be unique in that it offers a lot of useful features without requiring you to pay for them. It outcompetes the other free options by a long shot since it can work with many different database engines. It's hard for Liquibase to compete with options like Red-Gate SQL Change Automation or Flyway, since those tool suites are robust (and expensive), but it is a viable option for people who don't want to be stuck paying thousands of dollars per year for this type of software.

To see a more complete list of alternatives for this type of software, you can review the list that originally helped me in my search at DBMSTools.com.

Want to learn more details about how to work with Liquibase?

Next week, I will be posting about how specifically to set up and work with Liquibase since the startup documentation on their website was scattered and hard to move through as a beginner with the software. I thought I would make it easier for others to learn how to work with this tool by posting the notes I made while working with it.

Differences Between Postgres & SQL Server

A couple of weeks ago, one of my coworkers sent our group an article from Brent Ozar that they found interesting. The article is “Two Important Differences Between SQL Server and PostgreSQL”, which is a short and sweet discussion about two differences between the database engines that might trick or confuse a developer who’s learning PostgreSQL after working with SQL Server (someone exactly like me). After I read his post, I went down a rabbit hole of researching all of the little things left unsaid in the article that I hadn’t thought about yet since I’m only beginning my transition into Postgres. I recommend you go read that article, but here are the takeaways I learned from it as well as the extra information I learned from my additional googling.

CTEs are very similar but also different in Postgres vs. SQL Server

As Brent Ozar says in his post, CTEs are fairly similar between Microsoft SQL Server (MSSQL) and Postgres, but the main difference is in how the queries are processed by the engines which can affect how you should format your queries and their filters. If you don’t change your CTE writing style when switching to Postgres, you may end up with performance issues. This scares me a little simply because I’m sure this isn’t the only “gotcha” I’ll encounter while transitioning our systems to Postgres. However, I will always look at the whole project and each new challenge as a learning experience so that I don’t get overwhelmed by the mistakes I’m bound to make.

The one good thing I learned in my research about Postgres CTEs after reading Brent's article was that I can continue to use window functions with them just as I do in SQL Server, which is usually why I use CTEs in my current work anyway. I use the ROW_NUMBER function with CTEs quite often, so I was happy to see that the function also exists in Postgres.
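
For example, a pattern like this (with hypothetical table and columns) works essentially the same in Postgres as it does in SQL Server:

WITH ranked_orders AS (
    SELECT
        customer_id,
        order_date,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS row_num
    FROM orders
)
SELECT customer_id, order_date
FROM ranked_orders
WHERE row_num = 1;  -- most recent order per customer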

I have taken simple query formatting for granted my entire career

After reading Brent’s post and the other subsequent articles to learn more about Postgres DO blocks, I came to the realization that I’ve really taken for granted the nice and simple query formatting, for both ad hoc and stored queries, that SQL Server and T-SQL provide. Even if I’m running a one-time query for myself to get data needed to develop a stored procedure, I use IF statements quite frequently and that is so simple to do with SQL Server. I only need to write “IF… BEGIN… END” and I have a conditional code block that I can run immediately to get the data I need.

Doing that in Postgres is a little more complicated, especially if I want these conditional statements to give me a result set to review, since that isn't possible with DO blocks in Postgres. In order to run conditional statements in Postgres, you need to use a DO block to let the query engine know that you're about to run conditional logic. That itself is simple enough to adapt to, but there was one aspect of these code blocks that confused me for too long: the "dollar quote" symbols often seen in Postgres code examples. These were confusing because I wasn't sure why they were needed or when to use them, and everyone used to working with Postgres seemed to take that information for granted and never explained it in their examples.

In Postgres, when you define a function or DO block, the body of that block must be passed as a string, traditionally wrapped in single quotes; this is different from SQL Server, which doesn't require any kind of quote encapsulation for the code contained in an IF statement. Most developers use the dollar-quote style instead of single quotes because it saves you from having to escape every special character in the code block, such as single quotes or backslashes.

DO $$
BEGIN
    -- your conditional logic goes here
END $$;

Additionally, it is possible to specify tags between the double dollar signs (ex: DO $tag$ <code> $tag$), which can help you organize your code and make it easier to read. The tags also matter if you ever need to nest one dollar-quoted string inside another, because the inner and outer strings must use different tags.
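
A small sketch of what that nesting looks like (the tag names are arbitrary):

DO $outer$
DECLARE
    message text := $msg$It's Friday$msg$;  -- the inner string uses a different tag than the outer block
BEGIN
    RAISE NOTICE '%', message;
END $outer$;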

Once I understood these dollar quotes better, I felt more confident about being able to write Postgres SQL code. But there were still a few other catches with the code that I wasn’t expecting, coming from MSSQL.

Postgres is very specific about when you can and can’t return data from your query/code

Of all the Postgres info I have learned so far, I think this one is going to get me the most due to how I write queries on a day-to-day basis. As I said above, I write a lot of ad hoc queries throughout the day to view and validate data for many purposes. Postgres' rules about what you can return data from are much different than MSSQL's. In SQL Server, it seems like you can return/select data from anywhere your heart desires, including functions, procedures, conditional statements, and ad hoc queries. Postgres only allows you to return data from two of those options: functions and ad hoc queries (as long as they don't include conditional logic).

If you have a chunk of code that you want to save and run again later in Postgres, and that code returns data with a SELECT statement or a RETURN statement, you must use a function and not a stored procedure. Stored procedures in Postgres are meant to perform actions, not to return data. If you put a SELECT statement in a procedure in Postgres, it will NOT display that data like you would expect or are used to with SQL Server. This will affect my organization greatly when we move to Postgres because we have a lot of stored procedures whose sole purpose is to return specified data with a SELECT statement. All of those procedures will need to be converted to Postgres functions (which isn't that big a deal, but it definitely needs to be kept in mind).

You also are unable to return data from a DO block with conditional statements, as Brent Ozar mentioned in his post. If you want conditional logic that returns data for you to review, that will need to be done with a function. This one is a little crazy to me since it's the opposite of what's possible with SQL Server. I can run the following statement without issue as an ad hoc query in SQL Server, but it wouldn't work in a Postgres DO block:

DECLARE @Today VARCHAR(10) = 'Friday';

IF (@Today = 'Friday')
BEGIN
	SELECT 1 AS TodayIsFriday
END
ELSE
BEGIN
	SELECT 0 AS TodayIsFriday
END
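
For comparison, here is one way the same idea could be written in Postgres, as a function that returns a result set; the function name and shape are my own illustration, not something from Brent's post:

CREATE OR REPLACE FUNCTION today_is_friday(today text)
RETURNS TABLE (todayisfriday integer)
LANGUAGE plpgsql
AS $$
BEGIN
    IF today = 'Friday' THEN
        RETURN QUERY SELECT 1;
    ELSE
        RETURN QUERY SELECT 0;
    END IF;
END $$;

SELECT * FROM today_is_friday('Friday');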

This will take getting used to.

Postgres has the ability to understand other programming, scripting, and query languages

This is the most interesting new thing I learned about Postgres with this research. It is fascinating to me that this database engine allows you to easily write and execute code written in other languages and that it will work just as easily as normal SQL code. When you write a function in Postgres, you are able to specify what language you are writing the code in, which can be many different languages. If you don’t specify any, it will default to standard SQL.

The standard Postgres distribution includes four procedural language modules: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. The "PL" in those module names stands for "Procedural Language". Tcl and Perl aren't as interesting to me personally, since I've never heard of the first and have never used the second, but the other two built-in language options intrigue me.

PL/pgSQL interests me because it's the standard Postgres procedural language module, and it provides a lot of coding features that plain SQL doesn't, such as custom functions, procedures, and complex computations. These are the types of functionality I assumed were normal for database engines since they're built into T-SQL/SQL Server.

PL/Python also interests me since I’m learning Python at the moment, in addition to learning Postgres, and it seems like the most useful scripting language I’ve ever used. If you can integrate Python with the database engine so easily, I can see getting a lot of use out of that to pull and analyze data. But I haven’t yet used this functionality with Postgres so I can’t say for sure.
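
As a small taste of what that looks like, here is an example adapted from the PostgreSQL documentation; it assumes the PL/Python extension has been installed on the server, which it is not by default:

-- enable PL/Python in the database first (requires the extension to be available on the server)
CREATE EXTENSION IF NOT EXISTS plpython3u;

CREATE OR REPLACE FUNCTION pymax(a integer, b integer)
RETURNS integer
LANGUAGE plpython3u
AS $$
    return max(a, b)
$$;

SELECT pymax(3, 7);  -- returns 7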

Conclusion

Overall, I am excited to learn these new things about Postgres because I am starting to feel more ready for our migration from SQL Server to PostgreSQL. As I’ve written in posts before, I am a natural procrastinator and had been procrastinating starting to learn Postgres until my coworker sent me the Brent Ozar article. I am super thankful that he sent that, because it led me down this wonderful rabbit hole of learning many differences between Postgres and SQL Server that I’m sure will benefit me going forward.

The below list is all of the resources I used while learning the above information. I thought the EnterpriseDB link was the best overall summary of different topics while the rest really get into the weeds of different Postgres features.

  • https://www.postgresql.org/docs/current/sql-createprocedure.html
  • https://www.enterprisedb.com/postgres-tutorials/everything-you-need-know-about-postgres-stored-procedures-and-functions
  • https://www.postgresql.org/docs/current/sql-do.html
  • https://stackoverflow.com/questions/12144284/what-are-used-for-in-pl-pgsql
  • https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
  • https://www.geeksforgeeks.org/postgresql-dollar-quoted-string-constants/#
  • https://www.postgresql.org/docs/current/tutorial-window.html
  • https://www.geeksforgeeks.org/postgresql-row_number-function/
  • https://www.postgresql.org/docs/8.1/xplang.html
  • https://www.postgresql.org/docs/current/plpgsql-overview.html
  • https://www.postgresql.org/docs/current/sql-createfunction.html

How to Clean Up Old Local Branches With Git

If you use Git Bash or another form of Git on your local development machine for version control in your organization, have you ever looked at how many branches you have from old work? Sometimes I forget that Git is keeping every single branch I’ve ever made off of master in the past unless I manually go in and delete it. This means I end up with an insane number of local branches hanging out on my computer, some of them months old. It’s not necessarily bad that these local branches are still around, but I know that I will never need them again after the related ticket or software change has been deployed to Live. Any pertinent information that might be needed for reference for that branch is stored in our remote repo which means I don’t need it hanging around on my machine.

When I finally remember to check how many local branches I have on a repo (using the command “git branch”), I am shocked to see dozens upon dozens of branches like in the above screenshot (which is only about half of the old branches I have on that repo). Then I want to get rid of them but also don’t want to use “git branch -D <branch>” for every single individual branch to clean them up one by one since that would take me quite a while to complete.

The faster way to get rid of all local branches, which a coworker taught me, is the following: "git branch | grep -v "master" | xargs git branch -D". Note: use this with caution, because it will delete every branch except the one you exclude, and you don't want to delete something you still need. Also, there are some caveats where this command won't work, and you can read more about that on StackOverflow.

TL;DR: the above command will fetch the list of all branches available on the current directory/repo, will get all branches except the one you specify with “grep -v” (so that you can keep the local master branch), and will then delete all of those branches with a force delete.

Let’s break down each part of that command:

  • “Git branch”
    • This is the command that will list all local branches on the current repository
    • Using a pipe character (the vertical bar “|”) will tell the command to feed in the results of what’s on the left of the pipe into the command on the right of the pipe, which in this case means we are feeding the list of local branches into the command “grep -v “master””
  • “grep -v “master””
    • The grep command will print output matching a specified pattern
    • The option “-v” signifies that the inverse of the list matching a pattern should be output
    • In this scenario, the two above points mean that this full command is going to take the list of all local branches and print out all of them that aren’t the specified branch, which in this case is “master”. If your main branch isn’t called master, you can change that value to whatever branch you don’t want to delete with this command.
  • “xargs git branch -D”
    • The xargs command reads the list of branch names produced by the two previous commands from standard input and passes them as arguments to "git branch -D", so each branch in the filtered list gets force-deleted.
    • "git branch -D" is the command used to force-delete a branch (the -D is short for using the two options "--delete --force")

This isn’t the most necessary Git command you’ll ever use in your work, but it does come in handy to keep your work organized and decluttered if you’re someone like me who values that.

How to Stay Organized While Busy at Work, Part 2

Welcome back to part two of my discussion on ways to help manage stress and chaos at work. These posts are specifically aimed at database and application developers and IT personnel as a whole, although I think most of the topics would also apply to other office workers. Today’s post will cover the final three topics in the list below. If you would like to catch up on last week’s part one post, you can find it here.

Once again, I hope you find something valuable in this post, and I would love to hear your thoughts about how you’ve implemented these and had them work (or not!) in your own life. Thanks for reading!

  1. Your calendar is your best friend, use it to your advantage
  2. Only work on one thing at a time
  3. Every work request should be a ticket (mostly)
  4. Set times to catch up on email and messages throughout the day
  5. Do the hardest and most important work first thing in the morning
  6. Track what you spend your time on

Set times for yourself to catch up on emails and messages throughout the day

Going back to the advice to only work on one thing at a time, that should also include emails and messages. If you’ve never heard of the concept of context switching, I think you should read into it because it can be a huge time-sink if you don’t control it as much as possible. (Read about context switching from Atlassian here.)

While I am working, I find that if I am constantly being bombarded with messages and emails from other people, I get incrementally more stressed as the day goes on because it all starts to be overwhelming, especially when the group chats are going at 100 miles per hour some days. Plus, the context switching of always losing focus to read new chats as they come in is tiring and prevents me from getting important things done. This is why I've set a rule for myself: I use Focus Time like I said in part one to give myself dedicated blocks for focused work, and I mentally set times throughout the day to stay connected with coworkers.

I try to work with a loose version of the Pomodoro Method: intensely focusing on a task for 25-50 minutes and then taking a 5-10 minute break from that task. I used an actual timer app in the past, but now I just mentally keep track of how long I've been focused on one thing. Then when the break time comes, I quickly catch up on and respond to any messages I've received while focusing. Email I only check about 3 times a day (after my morning standup, before going to lunch, and before leaving for the day), since email is usually not as urgent as Teams messages. This method gives you the best of both worlds: you can still help your teammates with their work while getting your own work done without too much context switching.

Do the hardest and most important work first thing in the morning

This concept is what many in the self-help world refer to as “eat the frog”, which I think is a pretty weird name, but the concept is a good one. I am a natural procrastinator. You would have thought I would have learned my lesson with many late nights in college trying to finish work the night or two before a deadline, but I guess not. My procrastination comes from a fear that I won’t know what to do when I get into the task I need to complete, and then it will be scary because I won’t know how to proceed. However, not once in my professional career so far have I ever been assigned a task that I knew 0% about what needed to be done, even if all I know about the task is that someone wrote a vague document about it 5 years ago or that it needs to use recursion to get the data.

With all of this in mind, for about the last 6 months I've focused on starting my day by working on the most challenging and important tasks first thing in the morning. I usually start work around 7 AM and have my first meeting each day at 9:15 AM, so I know I usually have about 2 hours to buckle down and focus on that one difficult task, with no excuse not to. I still don't like doing the challenging thing, but doing it first thing in the morning ensures I have the mental energy for it because I'm freshly awake and sipping my morning coffee. Starting early also means that when I inevitably make good progress on it before lunch, I get a sense of accomplishment and a lighter load after lunch (when the onslaught of meetings normally begins). For the scenarios where I don't make significant progress in the morning because the task is that challenging, at least I know I put my best effort of the day into it and made some amount of progress instead of procrastinating and becoming even more afraid of the task at hand.

Track what you spend your time on

This relates to my earlier advice about making every work request a ticket, but the advice to track what you spend your time on isn't exclusive to ticket work. I also recommend that you track the time spent on other activities so you know what truly takes up a lot of time in your work day. If you are constantly helping others (which can be good) instead of doing your own work, you may look back on the day or week and wonder where your time went and why you weren't able to finish your tasks. I have been in that exact position, which is why I started keeping track of all the work I complete throughout the day, including small calls with teammates to help them with their work, meetings, and even presentation work time. I do this with a daily OneNote page, where I keep a general summary of what I did or learned on each of my tickets, but you can track it in any way you would like.

In the past, I tried using Trello and other software to do this timekeeping for me, but it always ended up being too complicated so I would inevitably stop using the software and stop tracking what I was working on. This year, I decided to make a OneNote page for each day of work and made a default template to use for all new pages that would give me the list of things I want to track each day. Then at the end of the week, I compile and summarize the daily work into a page that I call “Weekly Summary”, and add any notable work into my list of accomplishments for the year. This method speeds up and eases the process of identifying accomplishments so that when the annual review time comes around, I will have plenty of items to pull from my list instead of trying to remember it all myself (which I don’t because I have a terrible memory for these things).

My daily summary page contains the following items:

  • Tickets worked on (I will write about the challenges I’ve faced with the ticket today as well as the progress made on it)
  • Tickets closed
  • Resources found/used (this is where I keep track of the random StackOverflow pages or blogs I’ve used to help myself with my work throughout the day)
  • Other work completed (this is where I list any calls, meetings, etc.)
  • Other notes

Once I made the OneNote page template with these items, it became much easier for me to stick with completing the daily summary since I never had to think or retype the main bullet points. This process has helped me define how much time I spend on various activities throughout the day and week which gives me perspective on how much I do (a lot).

Your work time is important, guard it ferociously

My goal for these posts was to give others ideas for how to tame their workload to make it feel more manageable and less overwhelming. I hope that some of these ideas spark something for you to implement in your work life. Overall, I would like to get the point across that your work time is important, and your mental health relating to work is important, so guard both of these things ferociously. Managing your time and not letting others make your life hectic can help prevent burnout, or at least slow it down. These strategies have helped immensely in my chaotic work life, so I hope they can also help with yours.

How to Stay Organized While Busy at Work, Part 1

Recently, I was suddenly given responsibility for all database development work for the application I support, plus many other work items not directly related to the application, when previously I had been splitting that work with another developer. I wasn't expecting this change, and neither was anyone else, so I was immediately overwhelmed with the amount of work, questions, and requests for review coming my way. The first week of this new responsibility was chaos, and it made me realize I needed to tighten up my work organization strategies. I've always been someone who keeps my work organized and holds myself to a set pattern of efficient behavior, but it wasn't as regimented as it needed to be to keep on top of the flood of work that suddenly came my way. I quickly developed a set of organization and work strategies to make sure I stay on top of everything I'm responsible for without being extremely stressed by the workload. Plus, I love helping other people and didn't want to stop helping other developers because of the new workload, so I made sure my new strategy allowed time to continue with that.

If you’re feeling overwhelmed, stressed, and disorganized with your work and would like some ideas for getting organized to reduce those problems, keep reading. I ended up having a lot more to say on this topic than I originally thought, so I’ve split this topic into two posts. This post will cover the first three ideas on the list below, and the final three will be covered in next week’s post.

  1. Your calendar is your best friend, use it to your advantage
  2. Only work on one thing at a time
  3. Every work request should be a ticket
  4. Set times for yourself to catch up on email and messages throughout the day
  5. Do the hardest and most important work first thing in the morning
  6. Track what you spend your time on

That may seem like a lot of items to add to your already stressful work life, but trust me, getting and keeping things organized takes a huge load of stress off of your shoulders. You can’t control what craziness gets thrown at you by others, but you can control how you react to it and how you structure your day to handle it. Plus, if you pick just one or two to start with, the ones that would benefit you the most with the least amount of effort, it won’t feel like a burden to use these strategies.

Your calendar is your best friend, use it to your advantage

Seriously, if you’re not already using your work calendar to manage meetings and schedule yourself time to work on what you need to get done, you should start doing so immediately. Whether your calendar is shared with others or not, blocking time for yourself to work on specific tasks lets you prepare mentally for that work and then fully concentrate on it when the time comes. Plus, it keeps you accountable for getting work done. Here are my 3 quick tips on how you can utilize your calendar to its fullest:

  1. If you use Outlook and your organization has the feature enabled, use the Microsoft Viva plugin to schedule Focus Time for yourself every day. Focus Time is a feature that will automatically schedule blocks of time on your calendar, two weeks at a time, for you to focus on whatever you need to get done. While in Focus Time, your status on Teams will be changed to Do Not Disturb so that you won’t get all the pesky notifications of chats, posts, updates, etc., which cuts down on distractions. I have loved this feature since it got added because it gives me a dedicated and justifiable time for not responding to the constant chats I get throughout the day. Most people in my organization understand that Focus Time is sacred and that I won’t be responding to their chats until after the allotted time has ended.
  2. Schedule work time for the most important items you need to complete. Put it on your calendar and have it set your status as Busy for that time. The most important effect of doing this is that it lets others know not to schedule meetings over this time (unless they don’t respect calendars in general, which is a separate problem). I usually schedule these non-Focus Time work blocks when there is an important meeting coming up and I need to prepare for it. I will set aside 30 minutes to an hour before the meeting when I know I will work on that one task. And when the scheduled time arrives, work on that one task and don’t respond to emails or chats. Setting this time aside for yourself will help keep you on top of what needs to get done when it needs to get done so you’re never unprepared for a meeting or other task again.
  3. On the same note as #2 above, at the beginning of each week, block out time on your calendar every day for lunch and breaks, if you take them. I have found that I am the most productive and feel the best personally when I take a midmorning break, an hour lunch break (to go to the gym), and a midafternoon break. I have started blocking out those times on my calendar, which prevents people from scheduling over my breaks and also gives me reminders when it is time to take a break. If I didn’t have the reminder pop up, half the time I would forget to take one and would then feel burned out at the end of the day.

Only work on one thing at a time

This piece of advice is easier said than done, especially if you work in an organization that often suffers from poor planning or conflicting priorities, but I believe it is possible for everyone. It’s also physically the only way to work (unless you’ve figured out how to write code for two different programs at once). Since you can only truly work on one project at a time and focus on a single thing at once, that reality is your best starting point when fighting for buy-in from management or project managers to work on only one thing at a time.

For me, when things got crazy at work, I realized that I could no longer handle the stress of trying to accomplish all of the business goals within the same time frame they were originally scheduled for. I have worked extremely stressful jobs in the past and had vowed I would never put up with that again, so I had to set boundaries for myself in my current work to reduce the stress. I began pushing back on the analysts and program managers who decide work priority, so that the burden of making the difficult prioritization decisions I was facing fell on them, given there was now only one DB dev doing the project work.

As an example, during the craziest week of my life at my current job, I was already assigned two tickets, one of them high-priority. Then I went on vacation and came back and was assigned another high-priority ticket that had a deadline in less than two weeks and it was something I had never done before. As soon as I saw that mess, I went to our SA and asked him which of those 3 “high priority” tickets was the most important, stating that I could only work on one task at a time and there were only so many hours in the day, so they needed to tell me the order in which I should work them, according to the business needs. Within a few hours, I had an ordered list provided to me as to what I should be focusing on that day.

But after you push back and get a truly prioritized list of work items, you then need to keep your boundaries in place, no matter what else comes up. If others come to you asking for a lot of help with whatever they’re working on, tell them that you are currently unable to help but would be happy to once you’ve finished your current task (within reason; if you work in a collaborative environment like I do, you can’t blow your coworkers off all the time). Or if someone else tries to assign you more tickets that need to be done “right now”, push back on them and your manager and make everyone work together to figure out where the new task fits in your current priority list, and switch if needed.

Every work request should be a ticket

This piece of advice is one I’ve used on myself for both of my development jobs so far in my career, and I think it’s one of the easiest to enforce. If you are being asked to do work that will take 30 minutes or more, no matter what it is, create a ticket for it or have the requester make the ticket. I do this to cover myself because I never want to be the person who’s eternally busy but with no metrics to show where their time goes. Even if others don’t admit it, they may be wondering what you’re doing all day if you have no hours logged on tickets. If you set a standard that everything gets a ticket, you won’t ever have to worry about that. Also, having everything as a ticket can help with the prioritization of all work on your plate and helps keep all your current and future work documented and organized. Plus, I like to keep everything in tickets for the satisfaction at the end of the year of seeing how many work items I completed, along with the number of hours total I spent on everything I did (I love data, even on myself).

Conclusion

The three ideas above are only half of the story of what I have been using recently to keep my chaotic work life more organized. I won’t promise that it will make everything sunshine and rainbows, but it at least keeps the chaos reined in a bit and brings it down to a manageable level. If you’re interested in reading more about the final three methods I’m using for organization, that post will be going up next week. I hope these suggestions are as helpful to you as they have been for me!

Highlights of PASS Data Community Summit 2023

This year I was able to attend the PASS Data Community Summit for the first time, and it was also the first time I had ever attended a professional conference in my career. Not only did I learn more than I expected about new and existing databases and tools, but I was also able to meet many interesting new people that I never would have been able to connect with otherwise. Each of the people I chatted with at various points throughout the conference had a unique story to share and was more than happy to introduce themselves, share their experience, and offer me help going forward if I need it. Going in as someone who’s never been to such an event before, I was relieved to find that I was fully welcomed and not made to feel like an outsider.

While I really enjoyed meeting so many new people from all around the world (I can’t believe how far some people traveled!), the best part of the conference was learning about many new and useful topics. These are the most interesting highlights from the three days of the conference:

Red-Gate Test Data Manager

I think this was the most exciting thing I learned while at the conference because it would be the most beneficial for my job and company. Currently, my organization has a homegrown tool that pulls down a subset of test data (from our QA environment) to each developer’s local database. While this tool is useful and has been mostly sufficient for us since it was created, it’s becoming harder to maintain and less useful to developers over time. The main issue we have with our custom tool is that it’s set to only pull down 500,000 records for any given table, with no regard (for the most part) for getting useful data for testing. Often, when trying to test a complicated or totally new process, the subset either doesn’t contain enough records or doesn’t contain the records needed for testing.

That is why the Test Data Manager (TDM) from Red-Gate is so interesting to me. It solves all of the problems we currently have with our custom tool plus more we hadn’t considered yet. It allows you to create a clone in less than a minute from production data (my organization’s tool currently takes a long time to pull down all of the data and it pulls from our QA environment which isn’t as good of a testing set). TDM also allows you to mask any Personal Identifiable Information (PII) while keeping that data relevant, related, and useful after it has been masked (our custom tool isn’t that smart). Additionally, you can select a premade subset of data for any given table (small or large) or you can specify your own custom type of subset (e.g. only pull customer data for 2022 customers), and the software will ensure the data stays related and meaningful as it creates the subset of data.
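I haven’t gotten my hands on TDM yet, so the following is not its actual interface or output, just a way to picture the problem it solves. Conceptually, a custom subset rule like “only pull customer data for 2022 customers” boils down to a driving predicate plus following the foreign keys so child rows come along with their parent rows. A rough sketch in plain SQL, with made-up table and column names:

-- Driving table: customers created in 2022 (hypothetical schema)
SELECT c.*
FROM customers AS c
WHERE c.created_at >= '2022-01-01'
  AND c.created_at <  '2023-01-01';

-- Child rows follow the foreign key so the subset stays referentially intact
SELECT o.*
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
WHERE c.created_at >= '2022-01-01'
  AND c.created_at <  '2023-01-01';

The appeal of a tool like TDM is that it works this out across the whole dependency graph (and layers the masking on top) instead of someone maintaining queries like these by hand.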

Unit testing for databases is possible

You might think this is obvious, but the fact that unit testing can now be added to database code is crazy and exciting to me. I figured it would be possible to write unit tests for stored procedures since that’s just code, but there’s a whole world of possibilities for other ways to unit test databases using tools already created and perfected by others. The two main tools discussed regarding database unit testing were tSQLt, an open-source testing framework, and Red-Gate’s SQL Test, which utilizes that framework while adding extra capabilities for your team. While I haven’t had a chance to try either of these tools yet, I look forward to exploring the world of unit testing. I would love to have a way to automatically test database code changes, since currently all testing of database-related code changes, whether it’s SQL or SSIS, must be done manually by someone else on my team (and testing resources have been scarce lately).
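To make that a bit more concrete, here is a minimal sketch of what a tSQLt test might look like, based on my reading about the framework rather than anything demoed at the conference. A tSQLt test is just a stored procedure in a dedicated test schema; the dbo.AddOrder procedure and dbo.Orders table below are hypothetical names made up for illustration.

-- Create a test class (a schema that tSQLt treats as a group of tests)
EXEC tSQLt.NewTestClass 'OrderTests';
GO

-- The test itself is a stored procedure whose name starts with "test"
CREATE PROCEDURE OrderTests.[test AddOrder inserts exactly one row]
AS
BEGIN
    -- Swap the real table for an empty fake so the test runs in isolation
    EXEC tSQLt.FakeTable 'dbo.Orders';

    -- Act: call the (hypothetical) procedure under test
    EXEC dbo.AddOrder @CustomerId = 42, @Amount = 19.99;

    -- Assert: exactly one row should have been inserted
    DECLARE @RowCount INT = (SELECT COUNT(*) FROM dbo.Orders);
    EXEC tSQLt.AssertEquals @Expected = 1, @Actual = @RowCount;
END;
GO

-- Run everything in the test class
EXEC tSQLt.Run 'OrderTests';

As I understand it, Red-Gate’s SQL Test then gives you a friendlier way to manage and run tests like this from inside SSMS.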

PostgreSQL (PGSQL) is similar but also really different from Microsoft SQL Server (MSSQL)

Over the course of the Summit, I went to many different sessions related to PostgreSQL, some more helpful than others. What I learned is that there are several tools in the AWS platform that could help others transition from MSSQL to PGSQL, but those tools might not be as useful to my company. We are considering a full transition to PGSQL in the coming years for cost savings, which means we will want to completely move our current application systems from one engine to the other.

One of the AWS services I learned about that didn’t seem like it would be as helpful to us is BabelFish, which is a service that allows you to create an instance of PGSQL in AWS and still use your existing TSQL queries to interact with the database. This service seems like it would be very helpful for anyone who is looking to migrate a single database instance, or maybe up to a few, to PGSQL without having to rewrite their entire codebase, since it lets you keep working with your same application code. BabelFish probably won’t be helpful to my team in our migration efforts since we do want to do the full switch, so there would be no need to spin up BabelFish instances that we would later have to migrate fully anyway (especially since we’re looking at moving dozens of databases).

The other AWS services discussed for migrating from MSSQL to PGSQL, the Schema Conversion Tool (SCT) and the Database Migration Service (DMS), seem much closer to what my organization needs for our transition to Postgres. These tools convert most of your existing TSQL codebase to Postgres with little work on your part. However, AWS doesn’t claim that the converted code will follow PGSQL best practices or be performant, so that could be a challenge we’ll have to deal with later. The first half of the migration process is SCT, which converts tables, schemas, functions, procedures, etc. from TSQL to their Postgres equivalents (PL/pgSQL for the procedural code). The second half of the toolset, DMS, then helps with moving the data from one server to the other. If I had to guess how my organization will manage the transition from one engine to the other, I would say we’ll likely use these two tools to get started, recognizing that they won’t get us 100% converted and we’ll have to do some manual conversions ourselves as well.
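To give a sense of why the converted code still needs a human pass, here is a hand-written example of the kind of translation involved; this is my own sketch with made-up objects, not actual SCT output. A simple TSQL procedure that returns rows typically becomes a set-returning function in Postgres, TOP becomes LIMIT, and the naming and typing conventions all shift.

-- TSQL original (hypothetical)
CREATE PROCEDURE dbo.GetRecentOrders @CustomerId INT
AS
BEGIN
    SELECT TOP (10) OrderId, OrderDate, Amount
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId
    ORDER BY OrderDate DESC;
END;

-- Rough PostgreSQL equivalent, rewritten as a set-returning function
CREATE OR REPLACE FUNCTION get_recent_orders(p_customer_id INT)
RETURNS TABLE (order_id INT, order_date TIMESTAMPTZ, amount NUMERIC)
LANGUAGE sql
AS $$
    SELECT o.order_id, o.order_date, o.amount
    FROM orders AS o
    WHERE o.customer_id = p_customer_id
    ORDER BY o.order_date DESC
    LIMIT 10;
$$;

Whether the converted version performs well or follows Postgres conventions is exactly the part the tools don’t promise, and that’s where the manual cleanup will come in.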

The last thing I learned about how Postgres differs from Microsoft SQL Server is that the indexing system is much more customizable and specific in PGSQL than in MSSQL, which means I will need to learn a lot about indexes in the former. MSSQL only has a few index types you can use (clustered, nonclustered, and columnstore), while PGSQL has at least five or six different kinds built for very specific use cases. While it may be a bit confusing for me, a career-long Microsoft database developer, to figure out the different types of Postgres indexes available when we start the transition, I’m going to focus on what an interesting learning opportunity it is. This will be the first large database engine change of my career, and learning a new system will make me a better developer and a more marketable job candidate in the future.
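To make that concrete, here’s a rough sketch of what choosing an index type looks like in Postgres; the table and column names are made up, and the USING clause is where you pick the access method.

-- Default B-tree index, the closest analog to an MSSQL nonclustered index
CREATE INDEX idx_orders_customer ON orders (customer_id);

-- GIN index, built for jsonb, array, and full-text search columns
CREATE INDEX idx_orders_attributes ON orders USING gin (attributes);

-- BRIN index, tiny and effective on huge tables stored in natural order (e.g. by date)
CREATE INDEX idx_orders_created ON orders USING brin (created_at);

-- GiST index, used for geometric and range types
CREATE INDEX idx_deliveries_window ON deliveries USING gist (delivery_window);

Each of these exists because it handles a particular data type and query pattern better than a plain B-tree would, which is the kind of specialization MSSQL doesn’t really expose.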

AI is being added to tools many of us use daily to help us work more efficiently so we can do the more important and interesting work

This is one of my final takeaways from the conference, though I don’t think it will be quite as useful or relevant for my current role as the other things listed above. Many of the sessions at the Summit this year focused on how AI is going to start changing how we work, and how we will need to adapt and learn to use it to our benefit instead of being left in the technological dust by refusing to. AI certainly isn’t going to steal our jobs in the next couple of years, but it will change how we work, so we should prepare for that.

For myself, I plan on using AI tools currently available (mainly ChatGPT) to try to help me be more efficient in my role, probably with non-coding tasks such as writing emails and documentation. AI can help you work much more efficiently so you can spend most of your time on what is most important to you and your company instead of the mundane daily tasks that hog your time. That is the most exciting takeaway from the AI discussions I attended at PASS.

In the future, when other new AI tools such as Red-Gate’s SQL Prompt+ (which was announced at the conference) are released, I also intend to play around with them to see how they might increase my work efficiency. In the demo the Red-Gate team gave, SQL Prompt+ was used to highlight a query in a stored procedure and ask the tool for recommendations on improving that query’s performance. While I guarantee the tool will make plenty of bad suggestions that a human can easily tell won’t improve anything, I also think it could provide a good starting point for investigating underperforming queries, including approaches you may not have thought of yourself. That is why I come back to my point: AI isn’t going to take our coding jobs in the near future, but it is becoming a great resource for making those jobs easier.

The concept of AI in general is a bit frightening (as a lover of sci-fi and horror books and movies), but it’s not likely to take over the world anytime soon. However, it may trickle into your workplace in the next few years so it’s better to confront change head-on and figure out how you can use these new tools to your advantage.

Summary

I hope that in the future I’ll be able to attend more PASS Summit events, since this year’s conference proved to be such a fun and relatively easy way to learn a lot of new information relevant to my career. Whether you’re newer in your career like me or an old hand in the trade, I really would recommend attending the next PASS Data Community Summit if you’re able to. You never know what you’ll learn that could improve your work life and make you a better developer.