Tag: Git

How to Squash commits with Git Bash

I have been working with Git for the last several years, but in my current position, I am having to do more of the manual work with Git to ensure my commits meet our branch policies when pushing, since my current company has stricter rules on the pipelines. One of the Git activities I’ve found myself doing nearly every week now is squashing my commits. While initially learning how to do this, I found some resources online that were somewhat helpful, but as with most documentation, it seems the authors assumed some level of basic understanding of Git that I did not possess. I understand it now that I’ve been doing it so frequently, but I want to make a concise post about how to squash commits with Git Bash.

What is a squash commit?

To squash commits means to combine multiple commits into a single commit after the fact. When I code, I do commits every so often as “save points” for myself in case I royally screw something up (which I do frequently) and really want to go back to a clean and working point in my code. Then when it comes time to push to our remote repos, I sometimes have 5+ commits for my changes, but my team has decided that they only want to have squashed commits instead of having all that commit history that probably wouldn’t be useful to anyone after the code has been merged. That is when I need to combine and squash all my commits into a single commit. Squashing is also useful for me because while I am testing my code, I copy any necessary secrets and IDs directly into my code and remove them before pushing, but those IDs are still saved in the commit history, so our repos won’t even let me push while that history is there. Squashing the old commits into a single new commit removes that bad history from the branch I am pushing, which allows the push to go through.

How to squash multiple commits

For the purpose of this post, assume I am working with a commit history that looks like this:

613f149 (HEAD -> my_working_branch) Added better formatting to the output
e1f0a67 Added functionality to get the Entra Admin for the server
9eb29fa (origin/main, origin/HEAD, main) Adding Azure role assgmts & display name for DB users

The commit with ID 9eb29fa is the most recent commit on the remote. The two commits above it are the ones I created while I was making my code changes, but I need to squash those two into one so that I can push to our remote repo. To do this, I will run the following Git command:

git rebase -i HEAD~2

That command indicates that I want to rebase the last two commits on my branch (HEAD and the commit before it) on top of HEAD~2, which here is 9eb29fa. The -i indicates that we want to rebase in interactive mode, which will allow us to choose what happens to each commit and to edit commit messages in a text editor while rebasing. When I run the command, Git opens Notepad++ (which is the text editor I specified for Git Bash) with a document that looks like this:

pick e1f0a67 Added functionality to get the Entra Admin for the server
pick 613f149 Added better formatting to the output

# Rebase 9eb29fa..613f149 onto 9eb29fa (2 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup [-C | -c] <commit> = like "squash" but keep only the previous
#                    commit's log message, unless -C is used, in which case
#                    keep only this commit's message; -c is same as -C but
#                    opens the editor
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
#         create a merge commit using the original merge commit's
#         message (or the oneline, if no original merge commit was
#         specified); use -c <commit> to reword the commit message
# u, update-ref <ref> = track a placeholder for the <ref> to be updated
#                       to this position in the new commits. The <ref> is
#                       updated at the end of the rebase
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.

The first comment in the document, # Rebase 9eb29fa..613f149 onto 9eb29fa (2 commands), gives an overview of what the command is doing. We’re rebasing the two listed commits onto the most recent commit that’s on the remote, which, once we squash them, will give us one new commit after that remote commit in place of the two we currently have.
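The 9eb29fa..613f149 part of that comment is Git's range notation: every commit reachable from the second ID but not from the first. The same notation can be given to other Git commands to double-check how many commits you are about to squash, which is also the number to put after HEAD~. Here is a minimal sketch in a throwaway repo; the repo, the tag name pushed-point, and the commit messages are made up for illustration.

```shell
# Scratch repo so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "Commit that is already on the remote"
git tag pushed-point   # hypothetical stand-in for origin/main in a real repo
git commit -q --allow-empty -m "Save point 1"
git commit -q --allow-empty -m "Save point 2"

# Count the commits that exist locally but not at the pushed point.
ahead=$(git rev-list --count pushed-point..HEAD)
echo "$ahead"

# List those same commits, newest first.
git log --oneline pushed-point..HEAD
```

In a real repo you would likely replace pushed-point with origin/main (or whatever your remote branch is called).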

To squash these commits, I will change the top two lines of that document to:

pick e1f0a67 Added functionality to get the Entra Admin for the server
squash 613f149 Added better formatting to the output

No matter how many commits you are squashing, you always want to leave the command for the first commit in the list as “pick” and then change every other commit to “squash”. Once you have made that change, save the file and close it. Once you close that document, Git will open another text document containing the previous commit messages, giving you an opportunity to amend them. This is what my commit messages look like when the document pops up:

# This is a combination of 2 commits.
# This is the 1st commit message:

Added functionality to get the Entra Admin for the server

# This is the commit message #2:

Added better formatting to the output

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date:      Fri Jul 12 11:11:10 2024 -0600
#
# interactive rebase in progress; onto 9eb29fa
# Last commands done (2 commands done):
#    pick e1f0a67 Added functionality to get the Entra Admin for the server
#    squash 613f149 Added better formatting to the output
# No commands remaining.
# You are currently rebasing branch 'my_working_branch' on '9eb29fa'.
#
# Changes to be committed:
#	modified:   file1.py
#	modified:   file2.py
#	new file:   file3.py
#

I will change the file to the following so that I have a single, concise commit message (although I would make it more detailed in real commits):

Updated the files to contain the new auditing functionality.

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date:      Fri Jul 12 11:11:10 2024 -0600
#
# interactive rebase in progress; onto 9eb29fa
# Last commands done (2 commands done):
#    pick e1f0a67 Added functionality to get the Entra Admin for the server
#    squash 613f149 Added better formatting to the output
# No commands remaining.
# You are currently rebasing branch 'my_working_branch' on '9eb29fa'.
#
# Changes to be committed:
#	modified:   file1.py
#	modified:   file2.py
#	new file:   file3.py
#

Once you’ve updated your commit messages as you would like, save and close the file and then push the changes like you normally would. If you would like to confirm and review your changed commits, you can use git log --oneline to see that the log now reflects your squashed commit instead of what it had previously.
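If you want to practice the whole flow without touching real work, the steps above can be sketched in a scratch repo. This is only an illustration, not how I do it day to day: instead of opening Notepad++, it uses Git's GIT_SEQUENCE_EDITOR variable to edit the pick/squash list with sed, and GIT_EDITOR=true to accept the combined commit message as-is. It assumes GNU sed (which is what Git Bash ships with), and the repo, file, and commit names are all made up.

```shell
# Scratch repo so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# One "already pushed" commit followed by two local save-point commits.
echo base > file1.txt && git add . && git commit -q -m "Base commit"
echo one >> file1.txt && git commit -q -am "Save point 1"
echo two >> file1.txt && git commit -q -am "Save point 2"

# GIT_SEQUENCE_EDITOR edits the todo list: keep the first "pick" and turn
# every later one into "squash". GIT_EDITOR=true keeps the combined message.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i HEAD~2

# The base commit plus one squashed commit should remain.
commit_count=$(git rev-list --count HEAD)
echo "$commit_count"
git log --oneline
```

The file contents are untouched by the squash; only the history changes, which is what git log --oneline shows at the end.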

Note: One important standard of doing rebasing with Git is that you should not rebase changes that have already been pushed to a public or remote repo that others are using. It’s bad practice with Git to try to rewrite shared history, since keeping that history is the whole point of Git and version control. Try to stick to the custom of only doing rebasing with your own local changes.

Summary

In this post, I covered the basics of how to perform a squash commit of multiple commits using Git Bash. If this tutorial helped you, please let me know in the comments below. If you would like to read more of the details about what rebasing means, please refer to the Git documentation.

First Thoughts About Azure

As many of you probably already know, my cloud development career started in AWS, which I worked with for just about 3 years while I worked at Scentsy. Since my recent transition to a new job at a different company, I have started to develop in Azure instead, and it’s been a learning journey. Although both platforms allow for cloud development and processing, they have quite a few notable differences in what is offered and how they offer it, which is what I’m going to cover in this post today. My goal for this list isn’t to have a technical or all-inclusive list of the differences, but more of a difference a developer might feel in their own work if they make the same switch that I have.

Azure seems simpler

Azure is simpler yet still robust. Sometimes I feel like AWS tries to overcomplicate their services in order to make them seem fancier or more cutting-edge. And it also seems like they split what could be one service into multiple just to increase their total service count. Azure combines multiple functions I was used to in AWS into a single service. An example of that is Azure DevOps, which combines your ticketing/user story system with your DevOps pipelines and your Git (or other) repos. In my past job, we used TeamCity and Octopus Deploy for the pipelines, Jira for the ticketing, and Bitbucket to store our code, so I was a little confused my first couple of weeks in my new role since everything seemed to only be in one location. But I now find it nice and easier to work with.

Azure has better cloud ETL development

In the Azure cloud platform, there is a service called Synapse Workspace or Synapse Studio, and a second service called Azure Data Factory, which both allow you to create ETL pipelines right in the cloud. AWS has Glue, but that really doesn’t seem to have the same feel or capabilities that either Synapse or Azure Data Factory (ADF) has in the Azure realm. I have already updated and created several pipelines in each of those services in Azure and I really enjoyed working with them because they were very intuitive to get working with as a newbie and I could do everything I needed for the ETL right in the cloud development workspace.

When I worked with Glue in the past, it definitely did have some limited capabilities for making drag-and-drop ETLs in the cloud, but the service seemed to have a lot of limits which would force you to start writing custom PySpark code to make the data move. While writing custom code is also possible with Synapse and ADF, they both are built with more robust built-in components that allow you to make your ETLs quickly without writing any more custom code than a few SQL queries. I have really been enjoying working in these new services instead of AWS’ Glue.

More on Azure Data Factory

Another reason why I have been enjoying working with Azure Data Factory (ADF) is because it seems to be a modern version of the SSIS I am already familiar with, and located in the cloud instead of on an ETL server and local developer box. Although the look of ADF isn’t exactly the same as SSIS, it still is the drag-and-drop ETL development tool I love working with. And since it’s developed by Microsoft, you get all the best features available in SSIS ETL development without having to work with the old buggy software. I’m sure as I keep working with ADF that I’ll find new frustrating bugs that I’ll need to work around, but my experience with it so far has been only positive.

Power Automate & Logic Apps

Two other tools that aren’t available in the AWS ecosystem and that don’t seem to have an analog in AWS are Power Automate and Logic Apps. While these tools are more aimed at people who are not developers, to allow them to automate some of their daily work, they are interesting and useful features for certain scenarios and I am enjoying learning about them and playing with them. One of the best parts about working with Azure services is that it’s fully integrated into the entire Microsoft ecosystem, so you can pull in other non-Azure Microsoft services to work with Azure and expand your horizons for development. I’m not sure yet that I would 100% recommend working with Power Automate or Logic Apps for task automation (I’m still not done learning it and working with it), but it at least is another option to fall back on in the Microsoft realm that isn’t available in AWS.

Copilot isn’t what they want it to be

While most of my experience with Azure so far is positive, there are a couple of annoying things I’ve noticed that I think are worth sharing, although neither of them is so egregious that it would prevent me from recommending working with this platform.

The biggest negative about Azure for me so far is that Microsoft keeps trying to shove Copilot (their AI assistance tool which seems only slightly more advanced than Clippy) into every single product they offer even when it provides no benefit or actually detracts from your total productivity. The perfect example of this is the “New Designer” for Power Automate. For some unknown reason, Microsoft has decided that instead of allowing you to use a drag-and-drop interface for task components to build your automation flow, everyone should instead be required to interact with Copilot and have it build your components instead. That might be useful if you had already been working with Power Automate in the past and so knew what capabilities and components it offered. But as someone totally new to this space who is trying to learn how to use the tool and has no idea what is currently possible to develop, it feels basically impossible to communicate with the AI in any meaningful way in order to build what I want. I don’t know what to ask it to create when I’ve never seen a list of tasks that are available. Luckily, for now it is possible to toggle off the “New Designer” and switch back to the old designer, which allows you to add each individual component as you go and select those components from a list that gives you a short description of what each does. Maybe in the future I’ll be more open to using Copilot with everything I develop, but right now, as a new developer in Azure, it doesn’t work for me.

Unintuitive service naming

The only other nitpick I have about the Azure and Microsoft cloud ecosystem is that sometimes, the names they pick for their services don’t make sense, are confusing, or are the same thing as a totally different service. Microsoft doesn’t seem to be that great at naming things to make them understandable at a quick glance, but I suppose that can also be attributed to the desire of all cloud computing companies to make themselves look modern and cutting-edge.

The best example I can give of this phenomenon right now is that a data lake in Azure is built on what are called Storage Accounts, which is the blob storage service within Azure. It’s not as confusing to me now that I’ve been dealing with it for a month and a half, but that name doesn’t seem at all intuitive to me. Each time my colleagues directed me to go to the “data lake” I would get confused as to where I was supposed to navigate since the service I would click into was called Storage Accounts instead.

Summary

Although it felt like such a big switch in the beginning to move from an AWS shop to an Azure shop, I have already started to enjoy developing in Azure. It has so much to offer in terms of cloud ETL development and I can’t wait to keep learning and growing with these tools. I’ve already compiled so many things that I can’t wait to share, so I am hoping I will get those posts ready and posted soon so others can learn from my new Azure developer struggles.

How to Set Notepad++ as Your Default Git Editor

Welcome to another coffee break post where I quickly write up something on my mind that can be written and read in less time than a coffee break takes.


When you start working with Git Bash for the first time (or you have your computer completely reimaged and have to reinstall everything again like I did recently), you will likely encounter a command line text editor called Vim to edit your commits or to do any other text editing needed for Git. And if you’re like me, you probably won’t like trying to use a keyboard-only interface for updating your commit messages. If that’s the case, then I have a quick tutorial for how to make Git use a normal text editor to more easily update your commit messages.

What is Vim?

Vim is an open-source text editor that can be used in a GUI interface or a command line interface. My only familiarity with it is with the command line interface, as the default text editor that Git comes installed with. If you don’t update your Git configuration to specify a different text editor, you will see a screen like the following whenever you need to create or update a commit message in a text editor (like when you complete a revert and Git generates a generic commit message for you then gives you the opportunity to update it, or when you want to amend an existing commit). This is what the command line editor version of Vim looks like (at least on my computer).

I personally don’t like this text editor because to use it, you need to know specific keyboard commands to navigate all operations and I don’t want to learn and remember those when I can use a GUI-based text editor instead to make changes more quickly and intuitively.

How to Change Text Editor to Notepad++

The command for changing your default text editor is super simple. I found it from this blog post: How to set Notepad++ as the Git editor instead of Vim.

git config --global core.editor "'C:/Program Files/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"

After you execute that command in Git Bash, you can run this command to test it, which should open up Notepad++ to try to update the last commit you made: git commit --amend.

That blog post then says that you should be able to double-check your Git configuration file to see that the editor has been changed, but my file doesn’t reflect what the other post says, despite Notepad++ being opened for commit messages after I ran the command. This is what my gitconfig file currently looks like after setting the editor to Notepad++:

So your mileage may vary on that aspect. As long as the change works for me, I’m not too concerned about what the config file looks like.
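If you would rather not dig through the config file at all, you can ask Git directly which editor it has stored and which file that value came from. A minimal sketch: the first two lines point HOME at a throwaway folder purely so the example doesn't touch your real settings (that part is my own addition for illustration), and the Notepad++ path is the same one used in the command above.

```shell
# Use a throwaway HOME so this sketch does not modify your real ~/.gitconfig.
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME

git config --global core.editor "'C:/Program Files/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"

# Print the value Git has stored for the editor...
editor=$(git config --global --get core.editor)
echo "$editor"

# ...and the file that value was read from.
git config --global --show-origin --get core.editor
```

On your own machine you would skip the first two lines and just run the two --get commands.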

How to Revert Text Editor to Vim

If you ever want to change back to the default text editor, you can run the following command to switch back to Vim, and once again confirm it worked using the git commit --amend statement:

git config --global core.editor "vim"
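An alternative sketch, which is not what I did myself: instead of setting the editor to Vim explicitly, you can remove the global setting altogether with --unset, so Git falls back to whatever it would otherwise use (an environment variable such as GIT_EDITOR or EDITOR, a system-level core.editor, or its built-in default). As before, the throwaway HOME is only there so the example leaves your real config alone.

```shell
# Use a throwaway HOME so this sketch does not modify your real ~/.gitconfig.
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME

# Set an editor, then remove the setting again.
git config --global core.editor "vim"
git config --global --unset core.editor

# --get exits non-zero when the key is not set, hence the fallback text.
remaining=$(git config --global --get core.editor || echo "not set")
echo "$remaining"
```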

Conclusion

Changing your default text editor for Git Bash is extremely simple as long as you know where the exe file of your preferred editor is stored on your computer. It’s also simple to change back to the default Vim editor in the future if you want or need to.

What is Liquibase?

Liquibase is a tool used to track and deploy changes to databases. It can connect with Git and other software, including CI/CD software, allowing your team to collaborate and track changes to your databases. There are several different software options similar to (and even better than) Liquibase, but what makes this tool unique is that it has both an open source version, which could benefit smaller organizations that don’t have as much cash to spend on database software, and a “Pro” version, which gives more features for those who need it and can pay the additional cost.

I’m currently reviewing this software and playing with it for a project at work, and I’ve found it a bit challenging to figure things out from the Liquibase websites and documentation. I’m summarizing my findings here not only for others who may read this in the future but also to help myself get everything organized and clarified in my mind. This will be a two-part series of posts, with the second providing a deep dive into how to use Liquibase with your database.

What’s in this post:

  • Why would you want to use this tool?
  • Features of open-source edition
  • Features of Pro edition
  • How to get 30-day free trial of Pro edition
  • Other options for similar tools

Why would you want to use this tool?

You want to be able to track and manage changes happening in your database

In the modern development process, all code, including database code, should be tracked using source control. Using source control enables you to have a complete history of what changed when, and who made that change. Having this level of tracking can be great for overall documentation of what you have in your systems, but it can also help immensely in troubleshooting bugs and other issues in your code to find when an issue started happening. It also enables you to quickly revert any bad or unintended changes without having to update all the code again manually. This is a well-known and well-used system in the software development realm but doesn’t seem to be quite as popular in the database realm, at least from my own experience. Liquibase can help organizations start on this journey.

You want to automatically deploy database changes to multiple environments

When you first start using Liquibase, you can generate a changelog file that acts as a baseline for recreating the database, with the file containing all of the SQL queries necessary to recreate all objects within the database. Then, as you make more changes, those will be added to one or more files that can be tracked with Git or other source control to help with deployments and change tracking of your system. If you are looking for a way to automatically deploy your database changes through pipelines, Liquibase should be able to help with that, although I haven’t gotten that far in my project so I can’t say for sure. But based on how the tool is set up, you should be able to use the software on a build and deployment server and have it execute the changelog scripts you create on databases in other environments. You can read more about the integration with CI/CD systems on the Liquibase website.

You need a database change-tracking tool that works with multiple database engines

On the Liquibase website, they claim they can work with over 50 database engines, which is a huge claim to make and an accomplishment to have. There are very few other options for this type of tool that can claim to support more than just SQL Server, let alone 49 other database options. I am looking at this tool exactly for this reason because I need to be able to manage database changes on at least SQL Server and PostgreSQL going forward, and perhaps others if needed since it supports so many different options. At this point in my current tool exploration project, I have only played around with the tool and how it works with Postgres since that’s the future my department is looking at, but I’m sure it would be just as simple to set it up to work with a SQL Server database as it has been with Postgres.

Features of the Open-Source Option

I’m not going to go through and list every single one of the features of the open-source version of Liquibase because you can see that on their website. Instead, I will discuss the features I am interested in and why I think it’s interesting that those features are available completely for free.

Run Preconditions Before Executing SQL Changes

The concept of the preconditions interests me because it almost combines a custom system my department uses to validate data with the change tracking and deployment of changes. Although I haven’t had a chance to use the preconditions in my testing yet, it looks like a cool feature because it allows you to validate data using a SQL statement before the next change query is executed. For example, if you are adding an ALTER TABLE statement to your changelog file to drop or change a column, you can first add a precondition that will check to see what the data looks like in the column to ensure it doesn’t run if data exists in the column. With the precondition for SQL files, you specify an expectedResult value and then the query you want it to run, and if the results of the query don’t match the expectedResult value, it will fail the execution of the file and the changes won’t get deployed. This would be very useful for ensuring you’re only running code in the scenarios you want it to be run in.
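To make that a bit more concrete, here is a rough sketch of what such a precondition could look like in a formatted SQL changelog, written out from Git Bash. Treat it as an assumption-laden illustration based on my reading of the Liquibase formatted SQL documentation, not a tested changelog: the table, column, author, and changeset ID are hypothetical, and I haven't run this against a database.

```shell
# Write a small formatted-SQL changelog into a scratch folder.
# Table, column, author, and changeset id are hypothetical placeholders.
workdir=$(mktemp -d)
cat > "$workdir/changelog.sql" <<'EOF'
--liquibase formatted sql

--changeset demo.author:drop-legacy-column
--preconditions onFail:HALT onError:HALT
--precondition-sql-check expectedResult:0 SELECT COUNT(*) FROM customer WHERE legacy_code IS NOT NULL
ALTER TABLE customer DROP COLUMN legacy_code;
EOF

cat "$workdir/changelog.sql"
```

The idea is that the ALTER TABLE only runs when the count query returns the expectedResult of 0, meaning no rows still have data in the column being dropped.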

Preview SQL Changes Before Running Them

I’ve already used this feature in my testing to see what exactly Liquibase is doing under the hood when I run certain commands and apply changes, and it seems useful to me. I, like a lot of database developers, am paranoid about knowing what exactly I am about to execute on a database, so Liquibase provides that capability with the command “update-sql” which will print out in the CLI all of the SQL queries that will be executed when you apply the existing changelog and changesets to your database, including the ones that the tool runs in the background to track the changes in the two log tables it uses.

Automatically Detect & Script Change for Non-programmable Objects

When I was first working with Liquibase for my proof of concept project, I thought that the command to automatically generate changelogs based on the current state of the database was only available in the Pro version. But I finally figured out that my assumption was incorrect: you can use the “generate-changelog” command even on the open-source version, but it will only generate queries for what I call non-programmable objects, meaning everything but functions, procedures, etc. If you are working with the open-source version of Liquibase and use the “generate-changelog” command, it will generate a new changelog (or overwrite an existing one if you specify an argument) containing SQL queries to recreate every object in your database that isn’t one of the programmable type objects.

I was happy to find this command in the Liquibase arsenal because it’s a working method that my department uses regularly with our current database change management and source control tool. The normal workflow for a developer making a database change is to go into the database, make the change using SSMS, and then use our existing software to automatically detect the change and generate a migration script to track and implement that change upon deployment. This was a workflow that I was hoping to not lose as we change tools, and it looks like we won’t lose that if we switch to Liquibase. In my next post, I’ll cover both of the two different methods of creating and tracking changes with Liquibase, which each have a different angle of attack for where to make the change.

Features of the Pro Edition

I have been working with a 30-day free trial of the Pro edition of Liquibase to see what features it offers in that edition that we might need. If you would like to see a full list of the upgrades you can get with Liquibase Pro, visit their website to compare it against the open-source version.

What I found while working with the tool is that you do not need the Pro edition to use the command “generate-changelog” like I originally thought. If you would like to be able to automatically script out programmable objects like stored procedures and functions, you will need the Pro version for that. However, if you only want this tool to track changes to tables, foreign keys, primary keys, and other constraints (see the documentation for the entire list), then you would be fine to use the open-source edition of Liquibase.

Another big feature of the Pro edition that most people are probably interested in is that it is the only version of the software that you can integrate with CI/CD tools to do automatic deployments of the code you’re tracking with the tool. This feature would be a big one for my company since I doubt we are okay with going back to manual deployments of database code across dozens of databases in multiple environments.

The final reason you might choose the Pro edition over the open-source edition is that you can get a much greater level of support from the company with the Pro version (which makes sense). Pro comes with what they define as “standard” support which is through email. If you would like more advanced and involved support, such as 24-hour emergency help, you can add that to your subscription for an additional cost.

How to get a 30-Day Free Trial of Pro

If you would like to try out the Pro version of Liquibase to see if it meets your needs, they do offer a 30-day free trial through the website. You will need to go through the process of giving your email and other contact information, as well as setting up an appointment with one of their representatives to get the license key. In my experience, since my company is seriously considering this tool as an option and I can’t do the contact or negotiation for licenses (someone else handles that), I went through the meeting setup process but then contacted the representative and told them I would be unable to make the appointment, and still received the trial license through an autogenerated email. As of writing this, I still haven’t been bombarded by sales emails or calls from the company which I appreciate. It’s nice to be able to try out the full version of the tool without being harassed for it.

Other options for similar tools

While there are other very expensive options for database change management and tracking software available, Liquibase seems to be unique in that it offers a lot of useful features without requiring you to pay for them. They outcompete the other free options by a long shot since they can work with different database engines. It’s hard for them to compete with options like Red-Gate SQL Change Automation or Flyway since those tool suites are robust and expensive, but they are a viable option for people who don’t want to be stuck paying thousands of dollars per year for this type of software.

To see a more complete list of alternatives for this type of software, you can review the list that originally helped me in my search at DBMSTools.com.

Want to learn more details about how to work with Liquibase?

Next week, I will be posting about how specifically to set up and work with Liquibase since the startup documentation on their website was scattered and hard to move through as a beginner with the software. I thought I would make it easier for others to learn how to work with this tool by posting the notes I made while working with it.

How to Clean Up Old Local Branches With Git

If you use Git Bash or another form of Git on your local development machine for version control in your organization, have you ever looked at how many branches you have from old work? Sometimes I forget that Git is keeping every single branch I’ve ever made off of master in the past unless I manually go in and delete it. This means I end up with an insane number of local branches hanging out on my computer, some of them months old. It’s not necessarily bad that these local branches are still around, but I know that I will never need them again after the related ticket or software change has been deployed to Live. Any pertinent information that might be needed for reference for that branch is stored in our remote repo which means I don’t need it hanging around on my machine.

When I finally remember to check how many local branches I have on a repo (using the command “git branch”), I am shocked to see dozens upon dozens of branches like in the above screenshot (which is only about half of the old branches I have on that repo). Then I want to get rid of them but also don’t want to use “git branch -D <branch>” for every single individual branch to clean them up one by one since that would take me quite a while to complete.

The faster way to get rid of all local branches, taught to me by a coworker, is the following:

git branch | grep -v "master" | xargs git branch -D

Note: use this with caution because it will force-delete every local branch that isn’t excluded by the pattern, and you don’t want to delete something that you still need. Also, there are some cases in which this command won’t work, and you can read more about that on StackOverflow.

TL;DR: the above command will fetch the list of all branches available on the current directory/repo, will get all branches except the one you specify with “grep -v” (so that you can keep the local master branch), and will then delete all of those branches with a force delete.
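If you would like to see the command in action before pointing it at a repo you care about, here is a minimal sketch that runs it in a scratch repo. The repo and branch names are made up, and the git symbolic-ref line is only there to make sure the first branch is called master no matter how your Git is configured.

```shell
# Scratch repo so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

# Two stale branches standing in for old ticket work.
git branch old-ticket-1
git branch old-ticket-2

# The cleanup command from this post.
git branch | grep -v "master" | xargs git branch -D

# Only master should be left.
remaining=$(git branch --format='%(refname:short)')
echo "$remaining"
```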

Let’s break down each part of that command:

  • “git branch”
    • This is the command that will list all local branches on the current repository
    • Using a pipe character (the vertical bar “|”) will tell the command to feed in the results of what’s on the left of the pipe into the command on the right of the pipe, which in this case means we are feeding the list of local branches into the command “grep -v “master””
  • “grep -v “master””
    • The grep command prints the lines of its input that match a specified pattern
    • The option “-v” inverts the match, so only the lines that do not match the pattern are printed
    • In this scenario, the two above points mean that this full command is going to take the list of all local branches and print out every line that doesn’t contain the specified text, which in this case is “master”. If your main branch isn’t called master, you can change that value to whatever branch you don’t want to delete with this command. Keep in mind that the pattern is a plain text match, so a branch named something like “master-hotfix” would also be kept.
  • “xargs git branch -D”
    • The xargs command reads the items it receives on standard input (here, the list of branch names produced by the two previous commands) and appends them as arguments to the command that follows it. In this case that command is the normal “git branch -D”, so every branch in the list gets a hard delete.
    • “git branch -D” is the command used to force a delete of a branch (the -D is short for using the two options “--delete --force”)
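The breakdown above can be tried out piece by piece without deleting anything. In this sketch, a printf stands in for the output of “git branch” (the branch names are made up), and echo is placed in front of the delete command so that xargs only prints what it would have run.

```shell
# Stand-in for the output of "git branch" (these branch names are made up).
branches=$(printf '  feature-a\n  feature-b\n* master\n')

# grep -v drops every line that contains "master"...
kept=$(printf '%s\n' "$branches" | grep -v "master")
echo "$kept"

# ...and xargs appends what is left as arguments to the command after it.
# "echo" stands in for the real deletion so this sketch is harmless.
preview=$(printf '%s\n' "$kept" | xargs echo git branch -D)
echo "$preview"
```

Putting echo in front of the real command like this is a handy way to preview which branches would be deleted before running it for real. One thing to watch for: if you are currently on a branch other than master, the line for that branch starts with a “*” marker, so it is safest to switch to master first.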

This isn’t the most necessary Git command you’ll ever use in your work, but it does come in handy to keep your work organized and decluttered if you’re someone like me who values that.