Category: SQL Server (page 2 of 2)

Postgres vs. SQL Server, Part 3

If you haven’t seen it already, I had two previous posts listing differences I have found between Postgres and SQL Server while learning to develop in the Postgres environment. As I am continuing to learn, I am finding even more differences that I think are interesting, and they are going to be shared below.

What’s in this post

Creating a temporary table

The difference between SQL Server and Postgres for creating a temp table is small, but significant. Mostly because my company uses temp tables frequently in our stored procedure and ETL code, so as we migrate off of SQL Server, we’re going to have to do a lot of updating to move temp table statements to Postgres. The syntax for SQL Server is to use a hashtag/pound/number symbol in front of the table name, or to use a double-pound symbol to create a global temporary table, like “CREATE TABLE #MyTable” or “CREATE TABLE ##MyGlobalTempTable“. In Postgres, the syntax requires that you use the TEMP or TEMPORARY keyword to make the temp table, like “CREATE TEMP[ORARY] TABLE my_table“.

You also have to use the TEMP or TEMPORARY keyword when creating a temp table in Postgres by doing a SELECT INTO statement, such as SELECT column1, column2 INTO TEMP TABLE my_table. In SQL Server, that same command would look like SELECT column1, column2 INTO #MyTable.

General differences between temp tables in each engine

Not only is the way you create temporary tables different between Postgres and SQL Server, but how the tables are actually stored on the database is completely different between the two as well.

In SQL Server, any temporary objects are stored in a separate database called tempdb. When creating a temporary table, you can specify whether it’s localized to your current session only or whether it is a global temporary table that can be accessed from different sessions or query windows, as I mentioned above. To do the first, you specify the table name as #MyTable, then the do the latter, you specify the table name as ##MyTable.

In Postgres, temporary tables are not stored in a separate database. Instead, they are put into their own system schemas whose names are based on numerical values (ex. pg_temp_1, pg_temp_2, etc.), so it’s harder to see the specific location where your temp table is being stored. You also are generally unable to access a temp table in one session from a different session, so there is no concept of a global temp table in Postgres.

One similarity of temp tables between the two engines, though, is that they are dropped/deleted in both as soon as the session in which they were created ends.

Getting the current UTC time

In SQL Server, there are two built-in functions that allow you to get the current UTC time, GETUTCDATE() and SYSUTCDATETIME(). The former returns the time in type DATETIME, and the latter returns the time in type DATETIME2 (more precise).

Postgres also has multiple functions and methods of getting the current UTC time, but they look much different from each other than the SQL Server options look. While there are more than just these two options, the most common ways to get the UTC time are the following:

  • SELECT NOW() AT TIME ZONE 'utc';
  • SELECT CURRENT_TIMESTAMP AT TIME ZONE 'utc'

Although the two options look different, they will return the same value. I’ve had some issues trying to use one versus the other in different scenarios in the SQL code I’ve written for Postgres, so it seems like there may be times where it’s better to use one over the other, but I’m just not able to differentiate those scenarios right now. There are also other options for getting the current UTC time, but I won’t cover those here.

Updating a table with a JOIN

One of the biggest syntax differences between T-SQL and pgSQL that has really gotten me confused so far is the formatting of an UPDATE statement that includes a JOIN to a different table. To me, the syntax of T-SQL makes a lot more logical sense and the pgSQL syntax doesn’t seem as intuitive. The T-SQL syntax is as follows:

UPDATE p
SET p.MyColumn = s.MyColumn
FROM person AS p
INNER JOIN staging AS s
	ON s.MyColumn2 = p.MyColumn2

Then the syntax for pgSQL is:

UPDATE person
SET p.my_column = s.my_column
FROM staging
WHERE staging.my_column2 = person.my_column2

To me, the pgSQL syntax doesn’t scream “joining to a different table”; it really looks to me like a simple update statement without a join. Yet the join is there. Also, it doesn’t seem to let you use aliases for the query, which I also dislike since I use aliases for every query I write. I wish I could change the standard for the pgSQL syntax, but obviously that isn’t within my power so I’m going to have to learn to live with it.

Making a column nullable

Unlike the pgSQL syntax for updating a table using a join to a different table, I actually like the syntax of making a column nullable in pgSQL more than I like the T-SQL syntax. The pgSQL syntax seems much more clear and straightforward for this action. The syntax in Postgres for making a NOT NULL column nullable is as follows: ALTER TABLE my_table ALTER COLUMN my_column DROP NOT NULL;

The T-SQL syntax for the same action is less clear, since you are basically redefining the column like you would if you were adding it, instead of specifying what you are changing. If you had a column that was set to NOT NULL in a table, you would write out the whole definition of the column again but write NULL instead of NOT NULL. For example: ALTER TABLE Mytable ALTER COLUMN MyColumn INT NULL;

I’m not fond of that T-SQL syntax because you can’t tell at a glance what is changing with the column definition unless you already knew what it was before running the statement. Since that same statement would also be the same syntax for changing the data type for the column if the column was already set to be nullable.

Data type for automatically updated binary numbers

In SQL Server, there is a data type called TIMESTAMP or ROWVERSION that is an automatically updating binary value that is added to a record when that record is updated or inserted. (The TIMESTAMP synonym for the ROWVERSION data type has been deprecated.) My company currently uses this data type heavily to help identify changed data quickly for our ETLs. As we are looking at moving everything into the Postgres world, we have found that Postgres does not have any built-in functionality similar to the ROWVERSION column, at least not at the precision we would need it to be (you can read about system columns here). You of course can store binary data within a table, but there is no data type for columns that will automatically increment a binary value when data is changed with the same precision as SQL Server’s ROWVERSION.

There do seem to be some ways people have replicated that functionality using custom triggers/functions that we are considering, but I haven’t looked into it that much personally so I cannot yet speak to how well that would work to replicate SQL Server’s ROWVERSION functionality. So we are either going to have to add these custom triggers/functions or find another reliable and fast way to identify changed data between source and target for our ETLs. Soon, I am hoping to do a blog post about the possible Change Data Capture (CDC) options for Postgres that we are considering.

Conclusion

There are so many weird quirks and differences between SQL Server and Postgres that I am having to learn as I grow in the Postgres development space. While I have already listed so many of them across several posts now, I am sure there will be many more to come as I keep learning. And I will continue to compile a list of the differences and share them here on my blog as I find them.

Sources

  • https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-temporary-table/
  • https://www.postgresql.org/docs/current/datatype-datetime.html
  • https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-update-join/
  • https://www.postgresql.org/docs/7.3/sql-altertable.html

Getting SQL Prompt to Prompt on RDS Servers

This may seem like a ridiculous thing to need to write about, making the Red-Gate tool SQL Prompt generate prompts like it should, but I have been having a weird issue with it over the past couple months and have finally learned the solution. So of course I thought I should share it!

What is SQL Prompt?

SQL Prompt is a tool made by Red-Gate that works as a much cleaner, nicer, and more useful autocomplete feature for SQL Server Management Studio (SSMS). It is a plugin you install to SSMS that then seems to magically work to help you write queries faster. Not only does this tool autocomplete databases, schemas, tables, and column names for you in your queries, but it also provides a lot of other useful tools like a Snippets Manager, which allows you to use default and custom snippets to write code faster (e.g. writing “sf” then pressing Tab will type out “SELECT * FROM” for you so all you need to type is the table name you want to select from).

Every developer in my organization uses this tool heavily in our day-to-day operations while writing any SQL scripts because it makes writing queries so much faster. So when my SQL Prompt seemed to stop working after an update, I was getting really frustrated because it meant I had to write all of my SQL queries manually again. And when all you do all day is write SQL, that adds up to a significant hindrance to your work speed.

My Problem

The problem I was having with SQL Prompt was that when I connected to any of our RDS database instances, the tool would no longer do any prompting of schemas, tables, or columns which was making my coding life so much harder. Oddly, the snippets manager portion of the tool was still working fine, so at least I wasn’t needing to type out the queries I normally use the snippets shortcuts for. Also oddly, Prompt would work perfectly fine connecting to other databases that weren’t on RDS instances, it was only happening for RDS databases.

I dealt with this issue for months on our production server, since I figured it was due to the security settings or something else I wouldn’t be able to fix, and I don’t access prod servers very frequently, so when I did use them and the Prompt wasn’t working, it wasn’t as bothersome. But after I had to completely reset my developer computer and reinstalled SQL Prompt, I started having this same issue for our lower environment databases, that I work with every day, so Prompt not prompting was suddenly a big deal.

The Solution

I created a support ticket with Red-Gate since I’ve always had good luck with their support services. This time wasn’t any different. Within a couple hours of creating the ticket, I had an email from a support rep asking me if I had tried checking the “Trust Server Certificate” check box on the Connection Properties tab while connecting to the RDS servers. No, I had not done that because I did not know that was an option before that day.

I disconnected from the RDS server then reconnected, making sure to check that box before clicking “Connect”, and now I had SQL Prompt back up and working, providing prompts of schemas, tables, and columns just like I want it to. Yay!

I also logged onto our prod server to see if doing the same thing there would fix that issue, and it fixed Prompt there as well. I am so excited to be able to not type out every detail of every SQL query again!

I love quick fixes like this.

More Postgres vs. SQL Server

Welcome to a coffee break post where I quickly write up something on my mind that can be written and read in less time than a coffee break takes.

As I’m getting further into my PostgreSQL adoption project at work, I am continuing to learn about so many more small differences between Postgres and SQL Server that I never would have expected. None of the differences I’ve found so far are profound, but they will pose challenges for all the developers we have (including myself) that have only worked with SQL Server.

Postgres does not have default constraints

That’s right, there is no way to make a default constraint on a Postgres table, instead you make a default value, which cannot have a name assigned to it. In SQL Server, you can define a default constraint in essentially the same way as you would define a unique or key constraint and you can give that constraint a name. In Postgres, you simply specify that a column has a default value when adding that column to a table. In theory (I haven’t tested it yet), the generation of default values works exactly the same between the two engines, one just isn’t saved as it’s own script or file with a name.

The index differences are amazing (and confusing)

I’m not going to lie, the prospect of figuring out when to use what kind of index with Postgres is daunting to me. Deciding which type of index to use with SQL Server is very straightforward, since you only need to know if you already have a primary key/clustered index on the table already and then go from there. You pretty much only have clustered or non-clustered indexes in SQL Server. But Postgres has a handful of different possible options and you need to better know how the data will be queried in order to pick the best one. The default/usual best option seems to be a B-Tree index, which is comforting and familiar, but one day I know I’ll likely have to figure out the other options as well.

The default length of object names is really short

Of these three new items I learned recently, I think this one is going to be the most annoying. The name length for objects in Postgres is surprisingly low, only 31 characters. The max length for objects in SQL Server is much longer. I’m mostly concerned that we’re going to need to figure out how to rename many long object names as we migrate off SQL Server, and that could be tedious. But we’ll see how that goes. I read somewhere in my research that you may be able to change this default length, so that will need to be something else for me to research.

Differences Between Postgres & SQL Server

A couple of weeks ago, one of my coworkers sent our group an article from Brent Ozar that they found interesting. The article is “Two Important Differences Between SQL Server and PostgreSQL”, which is a short and sweet discussion about two differences between the database engines that might trick or confuse a developer who’s learning PostgreSQL after working with SQL Server (someone exactly like me). After I read his post, I went down a rabbit hole of researching all of the little things left unsaid in the article that I hadn’t thought about yet since I’m only beginning my transition into Postgres. I recommend you go read that article, but here are the takeaways I learned from it as well as the extra information I learned from my additional googling.

CTEs are very similar but also different in Postgres vs. SQL Server

As Brent Ozar says in his post, CTEs are fairly similar between Microsoft SQL Server (MSSQL) and Postgres, but the main difference is in how the queries are processed by the engines which can affect how you should format your queries and their filters. If you don’t change your CTE writing style when switching to Postgres, you may end up with performance issues. This scares me a little simply because I’m sure this isn’t the only “gotcha” I’ll encounter while transitioning our systems to Postgres. However, I will always look at the whole project and each new challenge as a learning experience so that I don’t get overwhelmed by the mistakes I’m bound to make.

The one good thing I learned in my research about Postgres CTEs after reading Brent’s article was that I can continue to use Window Functions just the same with them, which is usually why I use CTEs with my current work anyway. I use the ROW_NUMBER function with CTEs quite often, so I was happy to see that the function also exists in Postgres.

I have taken simple query formatting for granted my entire career

After reading Brent’s post and the other subsequent articles to learn more about Postgres DO blocks, I came to the realization that I’ve really taken for granted the nice and simple query formatting, for both ad hoc and stored queries, that SQL Server and T-SQL provide. Even if I’m running a one-time query for myself to get data needed to develop a stored procedure, I use IF statements quite frequently and that is so simple to do with SQL Server. I only need to write “IF… BEGIN… END” and I have a conditional code block that I can run immediately to get the data I need.

Doing that in Postgres is a little more complicated, especially if I’m wanting these conditional statements to give me a result set to review since that isn’t possible with DO blocks in Postgres. In order to run conditional statements with Postgres, you need to use a DO block to let the query engine know that you’re about to run conditional logic. That itself is simple enough to adapt to, but there was one aspect of these code blocks that confused me for too long, which was the “dollar quote” symbols often seen in Postgres code examples. These were confusing because I wasn’t sure why they were needed or when to use them and everyone used to working with Postgres took that information for granted and never explained it with their examples.

In Postgres, when you define a function or DO block, the code within that block must be encapsulated by single quotes, which is different from SQL Server which doesn’t require any kind of quote encapsulation for the code contained in an IF statement. Most developers use the dollar quote styling method instead of single quotes because doing so prevents you from having to escape every single special character you may be using in your code block, such as single quotes or backslashes.

DO $$
<code>
END $$;

Additionally, it is possible to specify tags between the beginning and ending double dollar signs (ex: DO $tag$ <code> $tag$) which can help you organize your code to make it easier to read. As far as I can tell, the tags do not provide any function besides styling code for easier reading.

Once I understood these dollar quotes better, I felt more confident about being able to write Postgres SQL code. But there were still a few other catches with the code that I wasn’t expecting, coming from MSSQL.

Postgres is very specific about when you can and can’t return data from your query/code

Of all the Postgres info that I have learned so far, I think this one is going to get me the most due to how I write queries on a day-to-day basis. As I said above, I write a lot of ad hoc queries throughout the day to view and validate data for many purposes. Postgres’ rules about what you can return data from are much different than MSSQL. In SQL Server, it seems like you can return/select data from anywhere your heart desires, including functions, procedures, conditional statements, and ad hoc queries. Postgres only allows you to return data from two of those options, functions and ad hoc queries (as long as it doesn’t include conditional logic).

If you have a chunk of code that you want to save to run again later in Postgres, and that code returns data with a SELECT statement or a RETURN statement, you must use a function and not a stored procedure. Stored procedures in Postgres are only meant to perform calculations or to do anything else besides return data. If you put a SELECT statement in a procedure in Postgres, it will NOT display that data like you would expect or like you are used to with SQL Server. This will affect my organization greatly when we move to Postgres because we have a lot of stored procedures whose sole purpose is to return specified data with a SELECT statement. All of those procedures will need to be converted to Postgres functions (which isn’t that big a deal, but it definitely needs to be kept in mind).

You also are unable to return data from a DO block with conditional statements as Brent Ozar mentioned in his post. If you want to have conditional logic that returns data for you to review, that will need to be done with a function. This one is a little crazy to me since that is the opposite of what’s possible with SQL Server. I can run this statement without issue as an ad hoc query in SQL Server, but it wouldn’t work in Postgres DO block:

DECLARE @Today VARCHAR(10) = 'Friday';

IF (@Today = 'Friday')
BEGIN
	SELECT 1 AS TodayIsFriday
END
ELSE
BEGIN
	SELECT 0 AS TodayIsFriday
END

This will take getting used to.

Postgres has the ability to understand other programming, scripting, and query languages

This is the most interesting new thing I learned about Postgres with this research. It is fascinating to me that this database engine allows you to easily write and execute code written in other languages and that it will work just as easily as normal SQL code. When you write a function in Postgres, you are able to specify what language you are writing the code in, which can be many different languages. If you don’t specify any, it will default to standard SQL.

Postgres comes with four different language modules installed, which are PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. The “PL” in those module names stands for “Procedural Language”. Tcl and Perl aren’t as interesting to me personally since I’ve never heard of the first and have never used the second, but the other two built-in language options intrigue me.

PL/pgSQL interests me because that’s the standard Postgres SQL language module available which gives you a lot of coding features that standard SQL doesn’t provide, such as custom functions, procedures, complex computations, etc. The types of functionality that I assumed were normal for database engines since they’re built-in to T-SQL/SQL Server.

PL/Python also interests me since I’m learning Python at the moment, in addition to learning Postgres, and it seems like the most useful scripting language I’ve ever used. If you can integrate Python with the database engine so easily, I can see getting a lot of use out of that to pull and analyze data. But I haven’t yet used this functionality with Postgres so I can’t say for sure.

Conclusion

Overall, I am excited to learn these new things about Postgres because I am starting to feel more ready for our migration from SQL Server to PostgreSQL. As I’ve written in posts before, I am a natural procrastinator and had been procrastinating starting to learn Postgres until my coworker sent me the Brent Ozar article. I am super thankful that he sent that, because it led me down this wonderful rabbit hole of learning many differences between Postgres and SQL Server that I’m sure will benefit me going forward.

The below list is all of the resources I used while learning the above information. I thought the EnterpriseDB link was the best overall summary of different topics while the rest really get into the weeds of different Postgres features.

  • https://www.postgresql.org/docs/current/sql-createprocedure.html
  • https://www.enterprisedb.com/postgres-tutorials/everything-you-need-know-about-postgres-stored-procedures-and-functions
  • https://www.postgresql.org/docs/current/sql-do.html
  • https://stackoverflow.com/questions/12144284/what-are-used-for-in-pl-pgsql
  • https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING
  • https://www.geeksforgeeks.org/postgresql-dollar-quoted-string-constants/#
  • https://www.postgresql.org/docs/current/tutorial-window.html
  • https://www.geeksforgeeks.org/postgresql-row_number-function/
  • https://www.postgresql.org/docs/8.1/xplang.html
  • https://www.postgresql.org/docs/current/plpgsql-overview.html
  • https://www.postgresql.org/docs/current/sql-createfunction.html