
TechCon365 Day 2 Recap

Today at TechCon365 in Atlanta, I attended another full-day pre-conference session, this time covering everything in the realm of Microsoft Fabric. Before today, I knew hardly anything about Fabric, aside from it being something Microsoft was pushing our team to use for a new project I wasn’t involved with. The workshop, led by John White and Jason Himmelstein, didn’t include any hands-on work but was still good. The two speakers did an excellent job of teaching all aspects of Fabric without being boring. This was the introduction to Fabric as a concept that I’ve been needing for months.

Note: The content below is not guaranteed to be accurate; it’s just what I took away from the workshop. In the coming months, I will be learning more about these tools myself to fact-check everything and see whether the new tools will work for my team.

What’s in this post

Goodbye Synapse

For me, the biggest takeaway from today’s workshop is that Microsoft intends to migrate its customers off Synapse as the data movement platform and into Fabric and its OneLake. As of now, Synapse no longer has a product manager (according to the presenters), which means Microsoft does not intend to make any further feature updates to the tool. There is still a maintenance team to fix any issues that arise and to keep it running, but there are no long-term plans for the product anymore.

Synapse going away is concerning to me because my company only recently migrated off SSIS onto Synapse. My fear is that if we spend a ton of time and energy converting our ETL systems to Fabric, then as soon as we finish the work, we’ll need to start anew with whatever product line Microsoft releases next, just like we did moving from SSIS to Synapse. I understand that cycles like this are everyday occurrences in the technology sector, but I also know that my boss and the executives at my company are not going to be happy when I tell them what I learned in today’s workshop.

If you’re in the same situation my team is in, fully on Synapse and maybe just wrapped up getting there, and are wary of yet another transition to Fabric, don’t worry too much. The presenters assured us that Synapse will be supported for a while yet; no concrete plans have been announced to retire it. Based on that, John and Jason recommended that teams stop doing any new development in Synapse and instead start new development in Fabric, as real-world test cases of how useful it can be. I haven’t yet done anything like that, so its usefulness is still unknown to me personally.

Hello Fabric

Fabric is being sold as the one-stop shop for all things data in the Microsoft universe, the future of all data analytics and processing. No longer is Fabric just Power BI 2.0; now it will serve as the location for anything you need to do with data. While the sales pitches are great, I am wary of the transition, as I said above. I dislike when companies claim that their new tool is going to solve all your problems, because nothing is ever that great in reality. But here is what I learned today.

OneLake: Put every speck of data here

With each tenant, Microsoft gives you a single OneLake, because it should be your one data lake for your organization, just like there is only a single OneDrive for each user. No more having dozens of storage accounts to act as separate data lakes for different processes. Now you get the one and you will like it. More details I learned:

  • Parquet files: the reigning champions
  • A “Shortcut” to the source data
    • If you want to get data from a source system, like your Oracle database for PeopleSoft finance, you can add a “Shortcut” to the data source without having to pull the data into OneLake with a normal ETL process.
    • This concept is the most intriguing to me, and I really want to see how it plays in real life with real data and systems. Could be too good to be true.
  • Security redefined
    • They’ve developed a new way to manage security for the data that is supposed to reduce overhead. They want to have you manage security in only one place, just like you’ll be managing all your data in one place with OneLake.
    • This feature is still in preview, so it remains to be seen whether it works as well as Microsoft claims.
  • File Explorer: An old but good way to explore your data
    • A new plugin for Windows lets you view your OneLake data through the Windows File Explorer, like they’re normal files. Which they are, if they’re Delta-Parquet flat files. The new feature is called OneLake File Explorer.
    • You can also view your OneLake data files through Azure Storage Explorer, which is a good second option as well since my team already uses that for other purposes.

For the OneLake concept, I like that it prioritizes reusing what you already have–tools, data, reporting frameworks–so that you don’t have to keep reinventing the wheel for each new data flow process you create. The concepts are good, and I’m eager to try them for myself after the conference.

Lakehouse

This is a new way to store data, offered exclusively by OneLake in the Microsoft ecosystem. It combines the concepts of data lakes and data warehouses by using Delta Parquet format files, giving you ACID-compliant transactions on the data files while still having the compressed, cheap storage of a data lake.

Data Factory

This is the part of Fabric that replaces Synapse pipelines. In the demos during the workshop, I could see that Dataflow Gen 2s and the data pipelines look almost exactly like what is currently available in Synapse, which is promising, because at least I won’t need to learn a new GUI. As with Synapse, the Dataflows are still backed by the Spark engine to allow for parallel processing and high throughput. Both tools let you move data from point A to point B (the OneLake) for scenarios where Shortcuts won’t work and you still need to literally move data between systems.

Fabric Native Databases

If the other new data storage types in OneLake and Fabric weren’t enough for you, Fabric-native databases will also be available. These will be SQL databases (and some other database types) that you create directly in Fabric instead of creating them independently and then joining them to OneLake with a Shortcut. This change is intended to take Fabric from a data analytics platform to a full data platform. However, the presenters of today’s workshop cautioned against putting production systems into these native databases yet, since they’re still new and not well tested in the real world. But maybe in the near future, we truly will be doing 100% of our data integration engineering in Fabric.

Summary

With this technology, it feels like we could be at the cusp of a new big wave of change for how we operate in the data realm, so I am really interested to see how things play out in the next year or two. Using Fabric for all things data could be the next big thing like Microsoft is trying to sell, or maybe it will simply fade into being just another option in the existing sea of options for data manipulation and analytics in the Microsoft ecosystem.

Do you think Fabric is going to be the next big thing in data technology like Microsoft claims?

Related Posts

How to use Logic App to send email from Synapse Pipeline

A task that I completed a while ago and wanted to share was creating two different pipelines for a business user to move data between one of their databases and another server for monthly processing. The business user requested that they receive an email if the pipeline failed, so that they would know sooner rather than later that there had been a problem with processing. Unfortunately, there isn’t currently a way to natively send an email from a Synapse Analytics pipeline; there is no built-in activity for that (that I’m aware of). To enable the functionality the business user requested, I had to create a custom system to send an email.


Overview of the process

Since there isn’t a native activity within Synapse Analytics that sends an email, the workaround I created is a Logic App that sends an email when triggered by an HTTP request, plus a Web activity in the pipeline that triggers the Logic App and passes through error information so it can generate a useful email. Keep reading to learn the details of how to set up such a process.

Creating a Logic App to Send an Email

    1. Click on the component “Add a trigger” in the designer. In the search box, type “request” and then select the option that says “When an HTTP Request is received”.

    2. Click on the link to “Use sample payload to generate schema”

    3. Enter a simple JSON-formatted string containing the values you will need to use to send the email

    4. Click “Done” when you’ve included all the parameters you want to include in your incoming JSON values, and your simple JSON will be automatically formatted in the exact way needed for the payload to handle the request.

    5. For the METHOD that will be used for the HTTP request, select “POST”

    6. Click into the open area of the designer to close the editor for the HTTP request component then click the “+” under the component to add another to the flow

    7. In the search box, search for “outlook”, scroll down until you see the section for “Office 365 Outlook” and click the component for “Send an Email (v2)”

    8. If you haven’t logged into an email account through the Logic App yet, it will request that you do so in order to start specifying how the email should be sent. Go through the process of signing into an email account. I recommend using a service account email instead of your own email, unless you’re the only person that will ever be working on this process in the future.

    9. Click into the “To” box in the properties of the “Send an Email” component, and you’ll see two blue icons pop up to the left of the box that will allow you to dynamically enter information into that text box. Click the top icon that looks like a lightning bolt to pick values from the previous step of the process.

    10. In the box that opens, select the “to” value from the HTTP request component to fill into the “To” box of the email properties.

    11. Repeat the same steps for the Subject of the email and the Body of the email.
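As an example of step 3, the sample payload can be as simple as a flat JSON object naming the values your pipeline will send. This is just a sketch; the field names below are the ones I chose, and yours can differ:

```json
{
    "to": "business.user@example.com",
    "subject": "Synapse pipeline failure",
    "emailbody": "Details about the failure go here."
}
```

When you click “Done”, Logic Apps infers a full JSON Schema (property names and types) from this sample, which is what makes those fields available as dynamic content in the later steps.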

    Adding the Email Capability to your Synapse Pipeline

    After saving the Logic App, you are ready to use it from your Synapse pipeline. Once the Logic App has been saved for the first time, you will see that a URL has been generated for the HTTP Request trigger, which you will use to call the Logic App from your pipeline. Copy that URL.

    Return to your Synapse pipeline so we can set it up to call the Logic App to send an email. Since we essentially created an API we can hit to send an email, we need to add a “Web” activity to our pipeline. In my case, I want to send an email to a business user when the pipeline fails, since they are the key party interested in knowing about a failure. For that, I have added my “Web” activity on the “On Failure” route of my pipeline (red “x”/line).

    Open the “Web” activity to set it up to hit your new API. Give the activity whatever name you would like; in my case it is named “Send Failure Email 1”. Then go to the “Settings” tab and paste the URL into the URL field. For “Method”, choose “POST”, since we want to send information to the URL from our pipeline.

    Next, we need to write the body of text to send to the API. This Body will need to match the formatting that we specified in the Logic App when we configured the JSON Payload. In that payload, I specified the following parameters should be required: To, Subject, and EmailBody. Based on that, I have written my Body for the Web activity as the following:

    {
        "to":"@{pipeline().parameters.EmailRecipient}",
        "subject":"@{pipeline().parameters.EmailSubject}",
        "emailbody":"Pipeline failed when trying to copy data for table @{item().TableName}."
    }

    When working with the expression builder in Synapse, you need to be very particular about formatting when using pipeline parameters or variables in an expression. Pay close attention to the double-quotes, the curly brackets, and the @ symbols. In my HTTP Body, I set the values of two of the parameters I send to the Logic App through pipeline parameters so that I only had to change them in one place and not in every iteration of the email component in my pipeline. If you don’t want to use parameters like that, you could hard-code the values of the key-value pairs in the body to whatever you would like, similar to how I set the value of emailbody.

    Click “OK” to close the expression builder panel. The last thing we need to add to our “Web” activity is a header for the POST HTTP message. Click the “+ New” button under Headers, then enter the Name as “Content-Type” and the Value as “application/json”.
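Before testing from the pipeline, you can also sanity-check the Logic App on its own by POSTing the same style of body to the trigger URL. This is a hedged sketch using only Python’s standard library; the URL and email values are placeholders, not values from my actual setup:

```python
import json
import urllib.request

# Placeholder: replace with the URL generated by your Logic App's HTTP trigger
logic_app_url = "https://prod-00.westus2.logic.azure.com/workflows/example"

# Body matching the sample payload defined in the Logic App trigger
payload = {
    "to": "business.user@example.com",
    "subject": "Pipeline failure notification",
    "emailbody": "Pipeline failed when trying to copy data for table dbo.MyTable.",
}

body = json.dumps(payload).encode("utf-8")
request = urllib.request.Request(
    logic_app_url,
    data=body,
    headers={"Content-Type": "application/json"},  # same header the Web activity sends
    method="POST",
)

# Uncomment once the URL is real to actually trigger the email:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The commented-out urlopen call is the only part that actually contacts the Logic App; everything above it just assembles the same POST that the Synapse Web activity will send.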

    Testing the Pipeline Email Capability

    Once you have finalized the addition of the “Web” activity in your pipeline, you are ready to Debug and test the changes, in the manner in which you normally test your Synapse pipelines. I knew my pipeline was set up the way I wanted when I created a false failure scenario (since the email only sends when a component fails), ran a debug, watched the previous component fail, and received an email containing the failure information that I wanted. Your testing may look slightly different.

    Summary

    Although Synapse Analytics doesn’t have a built-in feature for sending an email from a pipeline, you are able to build that functionality yourself fairly easily using a Logic App and a Web activity in your pipeline. I would love to hear if you found this method useful or implemented it yourself. Thanks for reading!

    Resources

    https://omnidata.com/how-to-setup-notifications-within-synapse-pipelines-using-logic-apps/

    Related Posts

    Using Custom Script Extensions with Bicep Templates

    After a previous in-depth post about creating a Bicep template to deploy an Azure Infrastructure as a Service (IaaS) virtual machine, which included the use of a Custom Script Extension (CSE), I want to go into more depth about some of the trials I went through while trying to get the Custom Script Extension to work. Troubleshooting it wasn’t the easiest thing in the world, since Azure gives very little feedback when that resource fails to deploy from a Bicep template. I am hoping I can give some assistance to anyone else struggling to use the Custom Script Extension resource to run a PowerShell script after deploying an Azure resource.

    This post only has the code for CSE itself, and not the rest of the template I use it in. If you would like to see the full template, please view it on my GitHub.


    What is a Custom Script Extension resource?

    In the Bicep world of infrastructure as code templates in Azure, there is a resource type called a “Custom Script Extension”, which isn’t so much a “resource” as you would normally expect in a Bicep template. Normally, a “resource” is what it sounds like: an Azure resource you want to create, like a server, network security group, database, or even things like role assignments. The Custom Script Extension, though, is a method of running a PowerShell script on a virtual machine (VM) resource after it has been deployed, to install software or do other necessary setup on the machine.

    My use case for a Custom Script Extension

    The reason why I use a Custom Script Extension in my Bicep template when I create an Azure IaaS VM is so that I can initialize the disks related to the VM so they can actually be seen and used like you would expect. For some reason, when you create an Azure VM with a Bicep template, it does not automatically join the created disks to the machine. Because of this, when you first log in to the machine after it has been created, you won’t see the disks in the File Explorer like you would expect. Thankfully, PowerShell and the Custom Script Extension allow me to initialize those disks and name them what we normally want without having to log in to the server and do it manually myself.
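To give a sense of what the script itself does, here is a hedged sketch of the disk-initialization portion. This is not my exact SetUpWindowsVM.ps1, and the NTFS volume label is a placeholder:

```powershell
# Find raw (uninitialized) data disks attached to the VM
Get-Disk |
    Where-Object PartitionStyle -eq 'RAW' |
    ForEach-Object {
        # Initialize the disk with a GPT partition table
        Initialize-Disk -Number $_.Number -PartitionStyle GPT

        # Create a partition using all available space and assign the next free drive letter
        $partition = New-Partition -DiskNumber $_.Number -UseMaximumSize -AssignDriveLetter

        # Format as NTFS with a placeholder label
        Format-Volume -DriveLetter $partition.DriveLetter -FileSystem NTFS `
            -NewFileSystemLabel 'DataDisk' -Confirm:$false
    }
```

Because the CSE runs the script non-interactively, everything has to complete without prompts, which is why the `-Confirm:$false` is there.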

    I originally had my PowerShell (PS) script set to also join the machine to our corporate domain after it was created, but I recently removed that part because it would not work with the protectedSettings section of the extension definition, which you’ll see below. If you want more details about why I couldn’t get this to work, and thus had to remove the most problematic sections from my PS script, keep reading.

    My setup for a Custom Script Extension

    The following is the Bicep resource script I used as part of my wider IaaS VM creation script to setup the VM after it is deployed.

    var initializeDisksScript = 'https://stgAcct.blob.core.windows.net/myContainer/SetUpWindowsVM.ps1'
    // The extension name must be prefixed with the VM name when declared as its own top-level resource
    var customScriptExtName = '${sqlVM.name}/CustomScriptExtension'
    
    resource customScriptExtension 'Microsoft.Compute/virtualMachines/extensions@2024-07-01' = {
      name: customScriptExtName
      location: 'westus2'
      dependsOn: [sqlVM]
      properties: {
        publisher: 'Microsoft.Compute'
        type: 'CustomScriptExtension'
        typeHandlerVersion: '1.10'
        settings: {
          // Files listed here are downloaded to the VM before commandToExecute runs
          fileUris: [
            initializeDisksScript
          ]
          commandToExecute: 'powershell -File SetUpWindowsVM.ps1'
        }
        // Values in this section are not written to the deployment log
        protectedSettings: {
          storageAccountName: storageAccountName
          storageAccountKey: storageAcctKey
        }
      }
    }

    A quick explanation of that definition: it creates a resource of an extension type for VMs, and it’s dependent upon the VM that I specify further up in the full Bicep template. The script extension is set to execute a PowerShell command so that I can run a file called SetUpWindowsVM.ps1, which the script runner downloads from the storage account location specified in the variable initializeDisksScript. There are two different sections of settings you can specify: a normal “settings” section, whose values will be output to the log after deployment, and a “protectedSettings” section, whose values do not get output to the log after deployment.

    How the Custom Script Extension Works

    After the Bicep template has created the VM, it will then set about running the script file I designated in my CSE resource definition. The first step is to download the file from the specified fileUris location, which for me is an Azure Storage Account. The extension is able to connect to that Storage Account, since I provided the name and access key in the protectedSettings, and then download the file from there onto the local machine. The general location it’s downloaded to is “C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\1.*\Downloads\<n>”, where “1.*” is the version of the Custom Script Extension handler and “<n>” is a seemingly random integer value that the extension picks. For me, that was always “0”. After the file is downloaded, the CSE handler tries to execute the commandToExecute that you specified in your Bicep template. Since the PowerShell file is downloaded locally to the location the CSE expects to run it from, you do not need to specify the full path to the file; you can use relative path formatting.

    If you’re having issues with the CSE, like I was, you can get insight into what happened when it ran by viewing the logs at “C:\WindowsAzure\Logs\Plugins\Microsoft.Compute.CustomScriptExtension”. For more on that, see the troubleshooting section below.

    Issues with the Custom Script Extension

    As of the time this was posted, I’ve been going back and forth with Microsoft support for several weeks trying to figure out how I can use the commandToExecute specification in the protectedSettings object of the resource definition, and we have not yet resolved the error. The issue is that the PowerShell script I actually want to run takes a parameter containing a password, so I should use protectedSettings to pass in the command with the password parameter, so that the password is not output to the deployment log in plain text after the template is deployed. However, if I put the commandToExecute into the protected settings, nothing seems to happen and the script is not executed. If I put the same exact command into the normal settings, the script completes successfully, yet my password is insecurely output to the log, which I do not want.

    Since I haven’t been able to resolve this problem, even with the help of Microsoft support, I have updated my PowerShell script to remove the section that joins the machine to the domain, which removes the need for me to pass through a password, so I can use the normal settings section to successfully call the commandToExecute. I am going to continue working with Microsoft support to see if we can finally come to a resolution, but I didn’t want to keep waiting on this post. If I ever fix the problem, I will update here.

    Troubleshooting help

    As I mentioned in the section above, you may run into random issues with the Custom Script Extension of Bicep (CSE) if you include it in your templates. Overall, I think it is still worth using this resource type, but you do need to be armed with some knowledge to help yourself as much as possible. These are the things I found useful when troubleshooting different issues with the CSE.

    • You will need to delete your VM and redeploy it many times while troubleshooting issues, so be ready for that. Thankfully, deleting a VM and all its associated resources through the Azure Portal has gotten a little easier recently, which will save you some time.
    • If you are unsure whether the problem is with your PowerShell script or your Bicep template, create a test version of your VM as it would be created by your template (or run the template to create the VM then log on to that) and manually run the same PowerShell script on the machine. If it runs when you execute it manually, then the issue is not the PowerShell script but with the CSE.
    • Do not create the CSE nested under the resource creation specification for the VM; list it as its own separate resource definition, like I have in my template. It’s much harder to understand what’s happening if you put it as a nested resource, and you can’t troubleshoot it on its own because the entire VM creation will fail if the CSE fails.
    • Make sure you specify the dependsOn property in the CSE resource definition, or else it will likely get deployed out of order. Bicep is supposed to be smart enough to know that some things should be deployed before others, but it doesn’t seem to understand order of operations for the CSE.
    • To view the deployment logs of the CSE, go to the Deployments section of the resource group you deployed the template into. Open the specific deployment you created for your template, then click on the link for the CSE resource deployment.
    (Screenshot: the deployment overview where you can find more details on the CSE resource deployment)
    • Check to make sure the file downloaded onto the local machine by the CSE handler is the correct file. It would be in this general location on the machine: “C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\1.*\Downloads\”. I never saw it be wrong, but it doesn’t hurt to double-check.
    • Check the execution logs of the CSE after it runs on the machine. You can find all logs in this location on the machine after it’s been created and the CSE executed: “C:\WindowsAzure\Logs\Plugins\Microsoft.Compute.CustomScriptExtension”. The most useful log to me was the one called “CustomScriptHandler”, since that shows the exact steps the CSE handler ran when it executed your script (or tried to execute it).
    • Set your PowerShell script to generate a log when it runs, because that will either tell you what happened and went wrong in PowerShell, or it will not be created at all like in my situation, showing that the CSE is the problem.
    # Base name for the log file; change to match your script
    $scriptName = 'customScriptExtensionLog'
    # Timestamp so each run writes its own log file
    $runtime = (Get-Date -Format 'hh-mm-dd-MM-yyyy')
    $logfile = "D:\$scriptName-$runtime.txt"
    # Record all script output to the log file
    Start-Transcript -Path $logfile

    Mostly, you will need a lot of time and patience to properly troubleshoot issues when the Custom Script Extension goes wrong.

    Alternatives to this resource type

    While I haven’t used it yet, there is an alternative resource type available to use instead of the Custom Script Extension (CSE), called RunCommand. If you would like to read more about this resource type to see if it would be a better fit for you, there is a great Microsoft blog about it.

    Summary

    While I may have gotten very frustrated using the Custom Script Extension resource type in Bicep recently, I still think it’s a very useful feature that could save you a lot of manual work in setting up a virtual machine, if you can get it to run the way you need. If you’ve also run into issues with this resource type, I would love to hear about them in the comments below! Otherwise, I will keep you all updated if I find a resolution to my specific issue after posting this.

    Related Posts

    An Introduction to Bicep

    I am starting a multi-week series of posts covering the topic of Bicep, Microsoft’s Infrastructure as Code scripting language, since I have been working with this tool heavily for the past several months. You may wonder why a database developer is using Infrastructure as Code, since DevOps engineers typically do that. However, I decided to learn Bicep as a proof of concept for my team since we repeatedly find ourselves creating the same types of resources for our projects. We would like to reduce the amount of time and effort we put into creating our infrastructure so we can spend more time developing databases and pipelines to get our users the things they care about.

    Stay tuned in the coming weeks for more details on how I have started integrating Bicep templates into my database development lifecycle.


    What is Bicep?

    Created using icon from iconmas on FlatIcon.com

    Bicep is a declarative scripting language created by Microsoft as a successor to, and an addition on top of, their existing Azure Resource Manager (ARM) templates. (Yes, the new templating language is called Bicep because the original was called ARM.) It allows you to easily and quickly write declarative scripts, called templates, that define how you want Azure resources to be created. These templates are used for Infrastructure as Code, which means you generate your infrastructure of cloud objects using code instead of provisioning them manually through a portal. The main competitor to Azure Bicep is Terraform, a multi-platform tool that does essentially the same thing. But Bicep is developed by Microsoft to work only with the Azure cloud, so it is better suited for deploying Azure resources than Terraform is.

    After you have created your Bicep script defining the resources you want to deploy to your Azure cloud environment, you deploy that template either manually or through a pipeline, and Azure gets to work creating the specified resources. In the background, as the deployment begins, the first thing Azure Resource Manager does is convert the Bicep script into the traditional JSON used by ARM templates, so that it can understand and process your requests, since Bicep is essentially a nicer skin built on top of ARM. Once the template has been converted to JSON, Azure Resource Manager parses all the resource requests and sends them to the correct ARM APIs to create those resources.

    If your template is set up correctly, all your requested resources will deploy to your cloud environment, and the time it takes to deploy depends on the resources you are creating.
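To make the shape of a template concrete, a minimal Bicep file might look like the sketch below; the storage account name and API version here are my own example choices, not from this series’ templates:

```bicep
// Parameters let you vary a deployment without editing the template
param location string = resourceGroup().location
param storageAccountName string = 'stgexampledemo01'

// Declare the resource you want Azure Resource Manager to create
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageAccountName
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
}

// Outputs surface values (like resource IDs) from the deployment
output storageAccountId string = storageAccount.id
```

Deploying it is then a single CLI call, for example `az deployment group create --resource-group MyRg --template-file main.bicep`.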

    Use cases for Bicep

    Bicep templates are great for any Azure cloud users who find themselves creating the same types of resources over and over again, or for anyone who wants to make their environment re-deployable. For example, if you need to recreate your development environment from production after each deployment cycle, you could write Bicep templates to do that work for you. Essentially, if you are doing resource deployment on a regular interval and you would like to stop doing all that work manually, to reduce human error or to make the process more efficient, Bicep may be the tool for you.

    What I use Bicep for

    I started using Bicep as a proof of concept for my team to determine if it would be suitable for our needs for creating Azure Infrastructure as a Service (IaaS) Windows servers on which we install SQL Server. As of the time I am writing this, I still haven’t figured out 100% how to generate those servers exactly as we need them through Bicep, but I’ve gotten a good way through writing a template to provision what we need. I have proven that Bicep will work for the task, but there are some final technical pieces, such as using a custom virtual machine (VM) image with SQL Server already installed with our required settings, that I haven’t quite figured out yet.

    Another use case that I have completed a proof of concept for is to generate all required minimum resources for our Azure-based AI chatbots with one template, which has already been immensely useful for me and my team. Creating fully cloud-based resources with Bicep has proven to be much easier to accomplish than creating the IaaS servers I started with, because all specifications and settings for those resources are managed by Azure, so Bicep handles them easily. It took me only a single afternoon to write my AI chatbot backend Bicep template, whereas it took me over a week to get my IaaS Bicep template partially working.

    This new AI backend template creates an Azure SQL Server, Azure SQL Database, firewall rules, key vault, key vault role assignment, storage account, auditing on the SQL Server, and a Search Service for managing the document indexes used by the bots we’re developing for our business users. When I was doing this all manually, it would take me about an hour to work through the portal and deploy each of these resources. But now that I’ve turned it all into a Bicep template, it takes only a few minutes to deploy it all.

    My thoughts on Bicep

    Created using icon from Freepik on FlatIcon.com

    Overall, I think Bicep is a cool and useful tool for creating your Azure resources with a single script, which can be deployed over and over again. It really speeds up the creation of resources and reduces the chance of manual error compared to clicking through the portal setup of the same objects. There are several flaws in how it operates, such as not always understanding the hierarchy and order of deployment required for related resources, which it claims to handle automatically, but that isn’t an insurmountable challenge.

    As with most scripting or programming languages, I wish there was more and better documentation from Microsoft to help us newbies figure things out faster, but it’s not impossible to learn. I would recommend that anyone who is tired of manually deploying the same or similar Azure resources repeatedly look into using Bicep to speed up their infrastructure work.

    Best resources for learning Bicep

I learned most of what I know about Bicep from a Microsoft Learn workshop that is free online. The first learning path I worked through was the Fundamentals of Bicep, which I would recommend for any developer who is new to Bicep. After I completed that one to get a general understanding of writing Bicep, I completed a few separate modules from the intermediate and advanced Bicep learning paths. Then there were many things I had to learn through resources outside of Microsoft or through my own trial and error.

    Summary

    Bicep is a great tool that all types of developers working in the Azure cloud can likely use to speed up their development lifecycles. If you’re interested in learning more about Bicep, stay tuned for the next few weeks as I take a deeper dive into how Bicep works using detailed real-world examples.


    A Fun SQL Function — QUOTENAME()

I have a short post today because I’ve gotten a little behind on my editing and want to get something out today. This week, I learned about the most useful little string function available in SQL Server that I wish I had known about previously. The function is QUOTENAME(), which isn’t the most intuitive name for a function.

    What’s in this post

    Overview of QUOTENAME()

The QUOTENAME() function puts delimiters around a string, which is especially useful for certain columns in your queries. It comes in handy in scenarios where you are going to use the output of your SQL queries in other, future queries, so having the strings delimited to meet SQL Server standards is essential.

For example, say you want to generate a list of the tables in your database and then use that list of table names in another query, so you would like the table names to be surrounded with square brackets, a common SQL Server convention. Instead of manually adding the brackets yourself, which I have done too many times to count, you can use the QUOTENAME() function, which will do it for you. The function also lets you specify other delimiters, like parentheses, if you would like that instead. There are more delimiters you are allowed to specify than I was expecting, so check out the W3Schools page on the function for more details.

    Demo of the function

    Here is a demonstration of the function being used on the [name] column of the sys.tables table.

    SELECT [name]
    	,QUOTENAME([name]) AS DelimitedName
    	,QUOTENAME([name], '()') AS ParenthesesName
    FROM sys.tables;
    This screenshot shows two uses of the QUOTENAME() function: one with the default setting of square brackets and one specifying the parentheses.

    In the example above, I have displayed the normal [name] column from the table and then used the same QUOTENAME() function two different ways. The first usage of the function only inputs the column that I want to be delimited, which then gets the default delimiter of square brackets. If you, however, would like to use a different delimiter, like parentheses, you can specify that as the second parameter to the function. In my testing, you can specify QUOTENAME(column, '('), QUOTENAME(column, ')'), or QUOTENAME(column, '()') and they all give you the same parentheses delimiting.
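To show why this matters for query generation, here is a small follow-on sketch that builds runnable statements from sys.tables, with each schema and table name safely bracket-delimited by QUOTENAME() (the generated queries will of course depend on whatever tables exist in your own database):

```sql
-- Generate a COUNT(*) query per table; QUOTENAME() brackets each identifier,
-- so names containing spaces or reserved words still produce valid SQL.
SELECT 'SELECT COUNT(*) FROM '
       + QUOTENAME(SCHEMA_NAME([schema_id]))
       + '.' + QUOTENAME([name]) + ';' AS GeneratedQuery
FROM sys.tables;
```

Each row of the result is a complete, safely delimited query you can copy into a new query window and run.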

    Summary

    If you often find yourself manually putting bracket delimiters around values you retrieve from SQL Server with queries, please consider using the QUOTENAME() function in the future instead to save yourself time and manual effort. I know I will for sure be using this now.


    Moving Data Between Azure SQL Databases–the Fast Way

    I have done more database creation in the past couple weeks than I had done in the previous year combined, all because my team is working on making several different AI chat bots for our business users and we are being very agile about our development process. At first we had separate Azure SQL Databases for each chat bot to use for its backend, but then I decided it would be more prudent, cost effective, and easier to manage if we combined the three related bots into a single database. This meant that I needed to move data from three disparate databases into one single database, and I wanted to do it easily without any extra tools.

I decided to move the data between the databases by quickly exporting it to a flat file and then importing that file into the new database, since I only had a few hundred rows to move from each source database. As far as I am aware, this should work with all modern types of SQL Server (on-prem, IaaS, Azure SQL Database). Read on to learn how to follow the same process I did.

    What’s in this post

    Exporting the data from the source

    On your source database in SSMS, run a SELECT query to retrieve all the data you want to copy over to your target database. Once you have your result set, select all records, right-click on them, then choose “Save Results As…” from the menu.

    You can export your query results using “Save Results As…” from the right-click dialog

    That will open the normal “Save As” dialog, and you can choose where to save your results. Save the file to your computer.

Once the file has been saved, I recommend opening the CSV using Notepad++ or a similar tool to review the formatting of the data. There were a few times when I went through this process that the column names did not get exported to the flat file, and instead the columns were generically named “Column1”, “Column2”, and so on (see the final screenshot below for what this looks like when trying to import). Chances are that you won’t run into this issue, but it doesn’t hurt to check. If for some reason your file doesn’t have the column names you expect, simply export the query results to a file again and it should fix the issue.

    Importing the data to the target

    To import the data onto your target database, connect to it in SSMS as well, right-click on the database name in the Object Explorer, then choose “Tasks” > “Import Flat File”. This will open a wizard to help you import the data.

    Import a flat file using the Task “Import Flat File”, which will open a wizard tool for you

    In the dialog that opens, browse to the file that you previously exported from the other database. Once you select that file, a table name is generated that the data will be imported to. You can change the name of that table if you wish. Also, choose the schema that table will go into. This process creates a new table from the imported data, so make sure you don’t use an existing table name.

    Specify the CSV file you created during the export process and the name of the new table you want to load the data into

    Click “Next” to move to the next page, where you can preview the data that will be imported. If the data looks good to you, click “Next” to move to the “Modify Columns” page where you can change some settings on the table columns if you would like. I chose to “Allow Nulls” for all columns, just to make things easier for myself in this import process. I plan on deleting this table after I copy the data from it into the final target table, so it doesn’t matter to me if every column allows nulls.

    This screenshot shows you what can happen if the export process goes a little wonky–the column names from the source may not be put into the CSV file and thus only show as generic names in the Import wizard. From this menu, though, you can modify the properties of the columns in the new table that is going to be created for the import.

    Keep clicking “Next” to move through the rest of the pages, review the process before you opt to import, then import the data. Once the import wizard has completed its work, you will now have a new table in your database, essentially a staging table.

    To get the data from your staging table into the final target table in your database, you can use an INSERT INTO tablename SELECT... query. This step also allows you to modify the data before moving it into the target table if you need to. In my case, I needed to add an extra column to the data that the source table didn’t have but the target requires, so I hard-coded the value for that column into my SELECT statement so it would be added to the final table.

    Cleanup

    After you have moved your data into its final destination, make sure to clean up the temporary new table that you created during the import wizard process, to keep things as neat as possible in your database.
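The final insert and cleanup can be sketched in T-SQL; the table, column, and hard-coded value names below are made up for illustration, including the extra column that the source data didn’t have but the target requires:

```sql
-- Copy the staged rows into the real target table, hard-coding a value
-- for a column the source database didn't have but the target requires.
INSERT INTO dbo.BotMessages (MessageText, CreatedDate, BotName)
SELECT MessageText, CreatedDate, 'SupportBot' AS BotName
FROM dbo.BotMessages_Staging;

-- Then drop the staging table that the Import Flat File wizard created.
DROP TABLE dbo.BotMessages_Staging;
```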

    Summary

    Using this method to copy data between two Azure SQL Databases may be considered a bit lazy, but I truly think it’s the most efficient way to get data from point A to point B. Sure, I could have made a pipeline or used some other system to move the data around, but I only had a few hundred records to copy over so this method worked great for me. Let me know if this method saved you time as well!


    Azure Storage Explorer for Moving Files

    Once again I am sharing a quick tip that will potentially speed up your work process, this time by using Azure Storage Explorer for moving database backup files between servers. This process saved me so much time this past weekend while completing a server migration, which is why I wanted to cover it with a quick post.

    What’s in this post

    What is Azure Storage Explorer?

    Azure Storage Explorer is a desktop application that helps you manage your Azure Blob Storage and Azure Data Lake storage. Before this past weekend, the only purpose that I had used it for was to propagate Access Control Lists (ACLs) for storage accounts, which is also a very helpful task it can accomplish. However, what I want to focus on for this post is how you can also use this tool to very quickly transfer files between servers, as long as those servers have access to the internet.

    Moving files with Azure Storage Explorer

    If you are ever in a situation where you are migrating a database from one server to another using the backup/restore method, and that backup file is very large even after being compressed, you may want to try this method of moving the file between your servers. This of course only works if you use Azure as your cloud provider.

With previous database migrations, I tried different methods of moving my file between the servers, and I even have an old post discussing this. But this past weekend, I was doing the production migration for a database whose backup file was larger than previous ones I had moved between servers, even after creating the backup in a compressed format. The first transfer method I tried was to drop the file onto a network drive and then copy it down from that share on the destination server. While that worked, it was pretty slow, taking about 30 minutes to complete. That would have been… fine… if I hadn’t run into issues with the backup, which forced me to generate a new backup file that needed to be copied over as well. Since I didn’t want to force the rest of the upgrade team to wait for that again, I started trying a new method.

While that slow copy was in progress, I quickly downloaded Azure Storage Explorer (ASE) onto the source server and uploaded my backup file to a storage account in our Azure subscription. To my surprise, the upload of the 15 GB file took only a minute or two, if I recall correctly. Whatever the exact time, using ASE was significantly faster, and it didn’t cause a browser memory issue like when I attempted to upload the same file to the same storage account manually through the Azure portal. For some reason, even though I got a manual upload to a storage account to work in the past, I have since had issues with my browser, both Chrome and Firefox, crashing partway through the upload. So this new, faster transfer method is a huge win for me!

I then quickly downloaded and installed ASE onto the target server, and downloading the file from the storage account through the application was just as fast as the upload. I had my database backup copied over to the new server in the time it took the network drive copy to reach only 20% done, so I gratefully cancelled that copy process. I was happy about the speed of ASE, and I am sure my colleagues were happy they didn’t have to wait on the database part of the upgrade even longer.

    Why is this so much faster?

Given how much faster the upload and download of my file went using Azure Storage Explorer compared to every other method I tried, I really wanted to know how it achieves that. Unfortunately, information about why and how it manages to be so fast seems limited. Part of the speed obviously came from our own network, but some of it certainly comes from something special in the tool, since uploading manually through the portal proved to be super slow and would then crash out in the browser.

From the few resources I’ve found online (listed in the Resources section below), it seems that the performance of ASE comes from its use of the azcopy tool under the hood, which parallelizes file transfers across multiple cores on the host machine. Whatever makes it work so quickly, I am very happy it exists, and I will likely be using this method of copying files between servers going forward. Downloading and installing ASE, then using it to upload and download files, is still much faster than trying to move big files any other way.
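If you want to skip the GUI entirely, azcopy can also be run directly. This is a sketch with placeholder paths, storage account, container, and SAS token values, not the commands from my actual migration:

```shell
# Upload a backup file to blob storage using azcopy's parallel transfer engine
# (placeholders: local path, storage account, container, SAS token).
azcopy copy "D:\Backups\mydatabase.bak" "https://mystorageacct.blob.core.windows.net/backups/mydatabase.bak?<SAS-token>"

# On the destination server, download the same blob to a local path.
azcopy copy "https://mystorageacct.blob.core.windows.net/backups/mydatabase.bak?<SAS-token>" "E:\Restore\mydatabase.bak"
```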

    Summary

If you need to move a large file between two servers in your Azure cloud environment, I highly recommend using Azure Storage Explorer to drop that file onto a storage account and then download it on the other server with ASE as well; both transfers should complete very quickly as long as your internet speed is reasonable. There are other methods of copying files between servers that I’ve discussed before, but this is now my preferred method.

    Resources

    • https://stackoverflow.com/questions/57729003/blob-code-download-much-slower-than-ms-azure-storage-explorer
    • https://azure.microsoft.com/en-us/blog/azcopy-support-in-azure-storage-explorer-now-available-in-public-preview/
    • https://learn.microsoft.com/en-us/azure/storage/storage-explorer/vs-azure-tools-storage-manage-with-storage-explorer?tabs=windows
    • https://azure.microsoft.com/en-us/products/storage/storage-explorer

    SQL Server 2022 – Getting the Installer Link to Work

I recently went through a day-long ordeal trying to figure out how to get the ISO/installer for SQL Server 2022 so I could install it on a few VMs I made for a software upgrade, which I wrote about in a post last week. After spending way too much time just getting the installer downloaded, I ran into another issue: when I clicked the link to do a new installation of SQL Server 2022, the link did absolutely nothing. The solution ended up being simple, but it had me scratching my head and bothering my teammates to try running the installer as well before I figured out what needed to be done.

    What’s in this post

    The Problem

When you run setup.exe (the installer) for SQL Server 2022 and select the option for “New SQL Server standalone installation or add features to an existing installation”, the software may do nothing when you click that link. It may show a spinning cursor for a brief moment, but then nothing happens at all, no matter how long you wait.

    The Solution

I found the answer to this issue on StackExchange. When I first downloaded and ran the installer, the standard Windows dialog came up asking if I wanted to trust the publisher of the program, since it was unknown, and I clicked yes. Despite that, my computer apparently still had the file marked as blocked, so when I clicked the link to start a new installation, the installer was blocked and nothing ran. To fix this problem, do the following.

Go to wherever you have the ISO file saved; for me, this was the Downloads folder. Right-click on the file and select Properties. On the General page of that dialog, at the bottom, there will be a checkbox that says “Unblock” with a message about security. Check the box, then click “OK”.
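The same fix can be applied from PowerShell with the built-in Unblock-File cmdlet, which removes the Zone.Identifier marker Windows attaches to files downloaded from the internet; the path below is just an example, so adjust it to wherever you saved the file:

```powershell
# Remove the "blocked" flag (the Zone.Identifier alternate data stream)
# from the downloaded installer; the file name and path are examples.
Unblock-File -Path "$HOME\Downloads\SQLServer2022-x64-ENU.iso"
```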

    If you were experiencing the same issue that I was, you should now be able to reopen the installer file and click the link to start a new installation, and move on with your day.

    Summary

There seems to be a bug with the installer for SQL Server 2022 where clicking the option to start a new installation does nothing. It can be resolved by “Unblocking” the installer file through the file’s Properties dialog.

    How to Download SQL Server 2022

You would think that figuring out how to download the installer for SQL Server 2022 Standard or Enterprise edition would be a simple task, much like downloading the installer or ISO for any other version of SQL Server. But with 2022, Microsoft seems to have changed their method of getting the software to people, making it a lot more locked down than other versions. I’m not sure why this change was made (if you have some insight, please feel free to share it in a comment), but it was made, and I have finally figured out how to get the installer under this new system. It took my colleagues and me a whole work day to figure this out. We had to work with our Microsoft account managers to get to the bottom of it, and it seems a little crazy to me that finding the ISO for this software would be that unintuitive.

    What’s in this post

    Background

In my company’s current architecture, we create SQL Server virtual machines (VMs) by creating standard Windows servers through the Azure portal, and then one of the database developers will log on to the VM and install SQL Server through the ISO/installer, instead of creating a SQL Server VM from a provided image in Azure. In the past, when we needed to install SQL Server onto a VM, we would google something like “SQL Server 2019 download”, click on the link that takes you to the Microsoft downloads page, and get the ISO from there. When we tried to do the same for SQL Server 2022, we were taken to a downloads page that only allowed us to download the Developer or Express editions; if we wanted Standard or Enterprise, we could only get an evaluation version after providing our information to start a free trial.

We obviously did not want a short trial version of the software; we wanted the full Standard edition, but we could not find a full installer/ISO for that version. For a few months, we gave up and installed 2019 instead of 2022 whenever we needed to create a new server, but we reached a point where we were required to use 2022 for some applications, so I set out to find out how to get the Standard edition of SQL Server 2022.

    The Proper Method

    There are two different ways that we found we could get a Standard or Enterprise edition of SQL Server 2022, and I’ll start with the method that I think they’re intending people to use. If your organization uses SQL Server in Azure already, you will likely need to go through this same method to get the installer for 2022.

Unfortunately, I can’t get any screenshots of this method because I’m not an elevated user in our organization, but there is a way for your Azure tenant owner (and a few other select individuals) to get the latest ISOs and installers for every piece of Microsoft software your company has licenses for. I believe it is at admin.microsoft.com, but you can talk to your account representative to find out for sure. From there, you can navigate to “Billing”, then “Your Products”, then “Volume Licensing”, where you can view all your licenses and get the downloads for them.

This is how we resolved the installer issue in our organization: we had our tenant owner download the ISO from this location and save the file in a shared location. Going forward, whenever we need an installer for any Microsoft product that we use, we will be required to have that one person download the files and send them to us (not a super convenient method if you’re a developer who doesn’t have full access to your entire Azure tenant).

    Once you have the ISO, you can run it as you usually would to install any version of SQL Server, and your license key will be prefilled into the installer so you don’t need to worry about specifying that at any time during the installation process.

    The Workaround Method

    Before we had talked to our Microsoft account managers, this is the route we were going to use to be able to install SQL Server 2022 on our VMs. Go to the SQL Server 2022 page on the Microsoft website (https://www.microsoft.com/en-us/sql-server/sql-server-downloads) then download the “Developer Edition”.

    After you download that version, run the installer that you downloaded. When it first opens, click the option that says “Download Media”:

    On the next page, click “Download”, which will download the ISO for SQL Server:

After the ISO file has downloaded, double-click on it in File Explorer, which will mount the image as a virtual DVD drive and open another directory. In that directory, double-click on “setup.exe” to run the installer for SQL Server.

    When the installer window opens, click on the “Installation” page from the left-hand pane, then click on the first link that says “New SQL Server standalone installation or add features to an existing installation”.

If you have an issue where clicking that link seems to do nothing, check out my other post about how to resolve that issue.

    Click “Next” multiple times to move through the initial installation steps for SQL Server 2022. Review the information on each page before moving past it to ensure the settings are what you want them to be. When you come to the following page, keep the radio button for “Perform New Installation” selected then click “Next” again.

    The “Edition” page is the one we’ve been needing to get to. On this page, you can change the selection from “Specify a free edition” to either one of the other two options to install Standard or Enterprise editions.

    If you select the option for “Use Pay-As-You-Go billing through Microsoft Azure”, there is a dropdown menu that will let you specify Standard or Enterprise editions. Note that going with this option would then later require you to specify Azure Subscription information to bill the server usage to.

If you select the option for “Enter the product key”, you can provide a license key for SQL Server 2022 if you already have one. Review the two optional checkboxes under the license key box and select any that apply to you. It looks like you can still move forward with the installation without specifying a license key after choosing this option, but presumably at some point you’ll be required to provide one to keep using the software.

    Note: If you go through the “proper method” I listed above, the license key should be prefilled in the third option when you get to this page with the properly downloaded ISO file. For our situation, we also need to make sure the checkbox below the license key box “I have a SQL Server license with Software Assurance or SQL Software Subscription” is checked, since we use Software Assurance for our licensing.

After you’ve made your selections on the “Edition” page of the installer, you are ready to move through the rest of the installation process, specifying whatever options are required for your own installation of SQL Server.

    Summary

There are two different methods I have found for getting the installer for SQL Server 2022 Standard edition, and neither of them is particularly intuitive. Based on your needs, you can choose either the “proper” or the “workaround” method. I hope this tutorial has helped you download the software without having to spend a day figuring it out like my coworkers and I did.

    Oracle Cloud World 2024- Day 1 In Review

    Today was my first day at Oracle Cloud World in Las Vegas, and also my first time ever at an Oracle conference, since I only recently started doing database administration with this RDBMS. As with any technology conference, the day was jam-packed with many different sessions. Although I obviously cannot convey all the information I learned at my sessions throughout the day, I will try to summarize the interesting and key points I took away from each.

    What’s in this post

    Keynote Session 1: “Customers Winning with the Cloud and AI”

I’m not going to lie, I went into this conference and this session not having a great opinion of Oracle, due to multiple negative experiences I’ve had with their database platform in the past few months. However, this keynote was not a useless session: I did learn that MGM Resorts owns a huge number of properties and hotels in Las Vegas, which is interesting from a big data perspective, and that the CIA was Oracle’s first customer. There are a lot of rumors online about how Oracle came to be and how it may or may not relate to a CIA project codename, but I couldn’t find any reputable sources, so we’ll just leave it at the CIA being the first and one of the largest customers of Oracle.

    Besides those interesting tidbits, this keynote mainly contained somewhat dry interviews with different large customers of Oracle talking about how they’ve utilized Oracle products to revolutionize their businesses and how they’ve started to use AI in their technology journeys. This was the beginning of discussions surrounding “AI” that continued throughout most of the sessions today. (On that topic, I’m starting to feel like the term “AI” is being watered down or misused at this conference to represent things that really shouldn’t fall under that term…)

    “Accelerate Your IAM Modernization to Multi-cloud Deployments”

    This session was not what I expected it was going to be, and it wasn’t the only one where that happened today. However, even though the content of this session wasn’t what I thought it was going to be, I did learn a few interesting things. The presenters gave many startling facts about the costs associated with data breaches as well as the causes of those breaches. The statistic that I found most interesting is their claim that 60% of breaches resulted from poor patch management.

    I was hoping that this presentation was going to cover more of the technical details of implementing and modernizing IAM within the Oracle ecosystem, but it proved to be a general overview of what everyone should be doing instead, which was what disappointed me about it. However, it at least gave me some topics that I can do further research on by myself to learn more about IAM in Oracle, such as the Oracle Access Manager. At least if I couldn’t get the technical details I wanted, I still got some direction about what to research next.

    “Create a Data Pipeline with Data Transforms in Autonomous Database Data Studio”

This presentation was more the style of what I was expecting from most of the sessions I chose today, but it once again covered completely different information than I thought it would. The session differed from my expectations because I was not aware that “Data Transforms” is an Oracle product; I thought it was being used as a general term in the title. If I had known the presentation would cover a specific service, I would have had a better understanding of what I was about to learn. The mismatch with my expectations did not make the presentation unenjoyable or uninformative, though.

What I learned from this session is that Oracle has two different ETL platforms available to move data around, similar to how Microsoft has SSIS for SQL Server (and other databases). Data Transforms was the service covered by this session, but they did mention Oracle Data Integrator (ODI), which is another, older ETL service. Data Transforms can move data between tons of different types of databases, not just Oracle, and it seemed to have a lot of interesting and easy-to-use ETL capabilities. It seems like they are trying to make this tool the data flow tool of the future, especially since they covered three features that are about to be added, like vector search/query capabilities. Although I haven’t had the chance to use this tool myself, I want to temper the expectations for what it can accomplish due to my own personal experiences with other Oracle services. Maybe it’s as fantastic and as useful as they say it is, or maybe it’s just another sales pitch that is better than the real user experience. If you have personal experience, good or bad, with Data Transforms, I would love to hear about it in the comments below.

    Keynote Session 2: “Oracle Vision and Strategy”

Of all the sessions today, I think this one had the most interesting pieces of information, although none of it was directly applicable to me or my company. The biggest downside of this keynote was that it went way over time, so I had to head out before the end. The two major topics of the keynote, presented by Larry Ellison, the founder and CTO of Oracle, were the integration of Oracle Cloud with the other major cloud providers, and various topics surrounding artificial intelligence and how Oracle wants to use it to fix the problems of the application and database development world.

    While I like the idea of them putting Oracle databases into the other cloud platforms–Google Cloud Platform, AWS, and Azure–because it gives me hope that maybe one day we could migrate our Oracle databases to a less fragile ecosystem, it did leave me wondering if one day in the future there will just be one single mega-cloud system and monopoly originating from the combination of the current big four (but maybe I’ve just been reading too many dystopian novels lately).

    I thought the second part of the keynote, surrounding current and potential uses of AI integration with other software systems, was more interesting and also a bit scary. Interesting in that automating mundane and error-prone processes makes our lives as database developers and administrators easier. There are so many things Mr. Ellison mentioned automating with AI that sounded great and useful to me. But it also scared me a bit, as it felt like there was an undertone of invasiveness being discussed under the guise of security. Security, on the technological and physical level, is important for individuals, groups, and even our whole country, but I personally believe security should not come at the cost of personal freedom and privacy. Some of the proposed and planned uses of AI, specifically how it relates to biometric authentication for every aspect of our lives, left me feeling a little uneasy (but once again, maybe it’s due to the large number of dystopian books I’ve read lately).

    “Access Governance: The Key to Ensuring the Survival of Our Digital Lives”

    I think this session was the best presentation of the day, as far as straight communication abilities go. The team of presenters was very well put together and knew their topics well without having to read off their slides at all, and I really appreciated that.

    The topic of this session was once again managing who can access what, with Identity and Access Management (IAM) as one of the core topics. The presentation was led by a member of the Oracle leadership team, accompanied by three Oracle customers, including a senior security engineer from Uber. Hearing from the different customers about their experience using IAM in general, not just the Oracle services, offered a great perspective on managing access to applications and databases, and gave me some ideas to take back to my own work. The main Oracle services covered were Oracle Access Governance and the Intelligent Access Dashboard, which I’ll need to do further research on myself now.

    “AI-Based Autoscaling with Avesha for Simplified OKE Management on OCI”

    This was my last session of the day, and although I was tired and dreaming of my hotel room, I found it to be another interesting presentation, even if it wasn’t super applicable to my work life. The title is quite a mouthful, but what it covered was how a small company called Avesha has created four different tools to help you autoscale and manage Kubernetes clusters in Oracle Cloud Infrastructure (OCI). All four of their tools seemed like they would be very useful for people working with Kubernetes in Oracle, since apparently the autoscaling in OCI doesn’t always work as well as people want it to (based on comments from the audience during the Q&A at the end of the presentation).

    While I don’t think my company will be using any of Avesha’s tools anytime soon, they did seem like they could be extremely useful to other organizations. And the presenters definitely understood their own products, down to the fine details of how they work, which is always a green flag I appreciate with software vendors.

    Summary

    Wooh, that was a lot of information to recap and cover for a blog post! After attending these six sessions on this first day of Oracle Cloud World, I’m a little bit overwhelmed and exhausted, and not quite ready for another day and a half of more info dumps. But that’s okay, because it’s to be expected from conferences like this. I am hoping that the sessions I picked for tomorrow are more applicable to my current role, but even if they aren’t, I’m sure I’ll learn more interesting things throughout the day.