Power Automate Desktop: Read & Filter Excel With SQL Query
In Power Automate Desktop there are two ways to read an Excel file. The traditional way is to use built-in Excel actions extract all of the spreadsheet’s data. A better way is to create a SQL connection to the Excel file and use SQL queries to get the data we need. SQL queries are twice as fast at reading data from an Excel file and we can write them to filter the data and only return a subset of what we need.
In this article I will show you how to get data from an Excel file using both methods and how to filter Excel data with a SQL query.
Table Of Contents:
• Introduction: The Employees List Excel File
• Create An Employees List Excel File
• Read Excel File Data Using A SQL Connection
• Read Excel File Data Using Excel Actions
• Testing The Speed of SQL Connection vs. Excel Actions To Read An Excel File
• Selecting Specific Excel File Rows With SQL Statements
• Filter An Excel File Where A Column Is Equals A Specific Value
• Filter An Excel File For Values Greater Than A Specific Number
• Filter An Excel File For Values Between Two Dates
• Filter An Excel File To Return Top Rows
Introduction: The Employees List Excel File
The human resources department of a financial services firm has a list of employees and other important details stored in an Excel spreadsheet. They want to use Power Automate Desktop to extract information from the Excel file and input it into their payroll system.
Create An Employees List Excel File
For this tutorial we need an Excel spreadsheet with employee information. Open Excel and type in some sample data as shown in the screenshot above. When saving the spreadsheet make a note of the filename and folder path where it is stored. The path to my spreadsheet is:
C:\RPA\ReadExcelUsingSQL\EmployeesList.xlsx
I have provided a copy of the Excel spreadsheet I used in my Github repository.
Read Excel File Data Using A SQL Connection
There are two ways to open an Excel file in Power Automate Desktop. First we will open the Excel file and read its data using a SQL connection. Then we will open it the traditional way using Excel actions and read the data. After trying both methods we will compare them to one another.
Open Power Automate Desktop and create a new desktop flow.
Set a variable with a path to the Employees List Excel file we created earlier.
Then insert an Open SQL Connection action.
Then write this code in the connection string field. A connection string specifies information about a datasource and tells Power Automate how to connect to it. Notice that the Source parameter includes a reference to Excel_File_Path variable.
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=%Excel_File_Path%;Extended Properties="Excel 12.0 Xml;HDR=YES";
Now we will write a SQL query to extract data from the Excel spreadsheet. Create an Execute SQL statement step as shown below.
Then write this code in the SQL statement field. This query says to get all of the columns in from the spreadsheet named Sheet1.
SELECT *
FROM [Sheet1$]
We must add one more action to close the SQL connection once the query is completed.
The full set of actions to open the Excel file with a SQL connection should look like this.
When we run the desktop flow it returns a variable called QueryResult with the full contents of the Excel spreadsheet.
Read Excel File Data Using Excel Actions
The normal way to read an Excel file is by using the Excel actions in Power Automate Desktop. Let’s give this method a try so we can see compare it the SQL connection option.
Insert a new Launch Excel desktop flow action.
Choose to launch Excel and open the following document at the document path stored in our variable.
%Excel_File_Path%
Then add a Read from Excel worksheet action. Use the Excel instance created by the Launch Excel action and retrieve all available values from the worksheet.
Finally, close the Excel file.
The completed desktop flow should look like this.
Run the flow and then look at the QueryResult variable. It contains exactly the same values.
Testing The Speed of SQL Connection vs. Excel Actions To Read An Excel File
Both methods of reading an Excel file return the exact same values. So why is it better to use a SQL Connection as opposed the Excel actions. One advantage is speed. A SQL Connection can read data from an Excel file twice as fast.
The following desktop flow performs both options and uses the get current date and time action to track when they started and ended. The subtract dates action determines how many seconds are between the start and end times and then the result is displayed in a message box.
When we run the desktop flow we can see the SQL connection is twice as faster at reading an Excel file as the traditional Excel actions.
Selecting Specific Excel File Rows With SQL Statements
An even greater advantage of using a SQL connection to read data from an Excel file is we can use SQL statements to obtain a filtered set of rows. If you want to learn how to write SQL queries I suggest using the awesome SQL reference guide on the W3 Schools website. I’ll also give a few examples to get us started.
Filter An Excel File Where A Column Is Equals A Specific Value
Suppose we wanted to only wanted to extract the employees who live in the US state of New York (abbreviation: NY).
Write this query in the SQL statement field. Notice that we’ve added a WHERE clause to our SQL statement. The WHERE clause allows us to test each row for a specified condition. In this example the us_state column must equal the text NY.
SELECT *
FROM [Sheet1$]
WHERE [us_state] = 'NY'
After the desktop flow is run we check the QueryResult variable and see it shows only employees from New York (NY).
Filter An Excel File For Values Greater Than A Specific Number
Next, let’s try to only get the employees whose annual salary is over $80,000.
Write this query in the SQL statement field. Notice that we use the greater than symbol to compare the annual_salary field to a number. We also use the ORDER BY clause to sort the query results from the highest salary to the lowest salary.
SELECT *
FROM [Sheet1$]
WHERE [annual_salary] > 80000
ORDER BY annual_salary desc
When we run the desktop flow our QueryResult variable looks like this.
Filter An Excel File For Values Between Two Dates
This example is one I have come back to many times. We can use a SQL query to return only the results between two dates. In this one I want to get only the employees hired during the year 2020.
Write this code in the SQL statement field. We can use the BETWEEN operator to specific two dates for comparison. Notice how dates must be written within two # symbols.
SELECT *
FROM [Sheet1$]
WHERE [hire_date] BETWEEN #1/1/2020# AND #12/31/2020#
Once again, run the desktop flow and open the QueryResult variable and it will only show employees with hire dates in the year 2020.
Filter An Excel File To Return Top Rows
This final examples shows how to get only the top 3 rows in an Excel spreadsheet.
Write this code in the SQL statement field. The SELECT TOP clause can be changed to increase or decrease the number of rows returned.
SELECT TOP 3 *
FROM [Sheet1$]
Run the desktop flow and check the QueryResult variable to confirm only 3 values are returned.
Did You Enjoy This Article? 😺
Subscribe to get new Power Automate For Desktop articles sent to your inbox each week for FREE
Questions?
If you have any questions about Power Automate Desktop: Read & Filter Excel With SQL Query please leave a message in the comments section below. You can post using your email address and are not required to create an account to join the discussion.
This is great @Matthew, thank you!
Is there a similar solution for cloud flows using Excel files on Sharepoin or OneDrive?
Nuno,
In Power Automate cloud flows you must use the get rows from a table action (Excel) and then use a filter action (Data Operations) to retrieve a subset of rows. You can achieve the same goal but not by using SQL.
Here’s a quick example I found:
https://powerusers.microsoft.com/t5/Building-Flows/How-to-filter-data-from-Excel-rows/td-p/187827
Thank you for sharing this amazing post. I just started learning PDA and it’s a really a very nice tool.
Will it be possible to read a specific worksheet from the excel file dynamically?
Example: If you have 10 sheets within the excel file the process should find the sheet before the last one, copy the data found in this sheet, create a new worksheet, and paste the data.
Best regards,
Julien
Julien,
You’d have to use Excel actions to find the sheet before last one:
* Launch Excel
* Get all worksheets
* Set variable varSecondLastSheet = %SheetNames.Count-1%
* Set active Excel worksheet, based on the Index, %varSecondLastSheet%
There’s a headstart for you. I think you’ll find the rest easy to do.
Thank you so much for your reply. It worked perfectly.
Dear Matthew,
thank you for sharing.
Have you been able to insert or copy the data table produced from Excel into a SQL database?
I did both ways to connect to the xls:
▫️by SQL connection and
▫️by reading xls
The resulting data table looks good/correct.
However, when using it to insert or to create a table in a SQLite database, it looks like PAD cannot proceed showing an error message ([HY000] “not such table” followed by the data rows)
The connection to the SQLite works otherwise with other queries.
I tried 2 different ways to connect from PAD:
▫️by SQL connection and
▫️ by CMD line call passing the query as parameter
Maybe the way I refer to the data table is wrong?
▫️%xls_datatable_name%
▫️’%xls_datatable_name%’
▫️[%xls_datatable_name%]
Looking forward to your ideas …
Ed John,
Would you be able to share the SQL statement you wrote to insert Excel rows into the SQLite database. I’d like to know a little more about your approach.
Matthew,
I tried different queries/approaches:
▫️INSERT INTO ‘%target%’ SELECT * FROM ‘%excel_datatable%’
▫️or after dropping the target table:
CREATE TABLE ‘%target%’ AS SELECT * FROM ‘%excel_datatable%’
but I reckon it’s rather a wrong assumption of me how the query works:
as the excel data table is not part of the database the query will not find it.
I found this:
https://powerusers.microsoft.com/t5/Power-Automate-Desktop/Insert-data-table-into-sql-table/td-p/916417
Thanks again!
Ed,
Thanks for your patience. I plan to look into this on the upcoming weekend.
My theory is you must loop over the excel_datatable and INSERT INTO on the current record as opposed to the whole table at once. I’ve downloaded SQLITE and plan to test this theory on Saturday/Sunday.
Thank you, Matthew!
No need to waste your time if you are busy with other topics.
l am pretty sure insert the data table row by row will work.
Instead, the CSV import of the complete data table might be more efficient depending on the numbers of rows.
The CMD call of sqlite3 passing the CSV file from PAD works quite well.
Cheers!
Thank you for this @Matthew. It is awesome.
Using this and cloud flow, which is faster?
Oluwatobi,
I have not done any performance comparison because I do not view them as alternatives to one another. Do you have a decision to make where you must decide between cloud flow or RPA and the correct choice will be the one with better performance?
Great explanation. I am searching for ways to execute a sql query in database and load the result into dynamically created excel file in the sharepoint. Excel file name should be todays date. How to achieve this?
Reni,
You could use the get today’s date action to get the current date, then format the date as text in another action and use it as part of your Excel file name.
Mauricio,
Since this method relies on the MS Access engine I would suspect you must use the & operator to join columns:
https://support.microsoft.com/en-us/office/combine-text-values-by-using-an-expression-d1653b60-dd60-4a05-9747-6b31ba4c4eed
How to achieve a query just for required column just say fullname, id ,gender rest not required
Shital,
Try this:
SELECT fullname, id, gender
FROM [Sheet1$]
Hi Matt, I have one question. How can we join data from two separate files?
Mike,
I don’t have the answer for you. But I really, really want to find an answer. I think this is a good idea for a future blog post 🙂
Thank you Matthew you are a genius this is just what I needed for filtering my excel report
Anelisa,
You’re welcome. There is truly not enough good Power Automate Desktop content out there. I feel like I am always searching for the articles I end up writing.
How to add condition in power automate desktop flow if a date column in excel equals to Today.
Ansuree,
Get the current date in a different desktop flow action. Then use this variable output by that action in your SQL statement.
Hi @Matthew
I’m struggling with a date query on PAD. Not much support happening on the poweruser community 🙁
https://powerusers.microsoft.com/t5/Building-Flows/Power-Automate-Desktop-SQL-query-for-Excel/m-p/1809411/thread-id/199735
The date part of this query does not return results
If I remove the AND [Actual Work End]….. then the query returns results as expected.
Whats the syntax for filtering by date?
Matthew
Matthew,
You stole my name. Just kidding, lol.
I think you’ve got it correct. But is it possible that your Actual Work End date is a TEXT data type? That could be the error reason.
Hi Matthew,
Great
mindsnames think alike!It was a TEXT field.
I used this to make it work:
Hi Matthew,
Is there any way to retrieve data directly from inside a PC application such as an accounting software that works on an SQL database by filtering the recorded data inside of it related to your needs to an Excel file by just executing a command inside of Excel ?
I would really appreciate any guidance or suggestions that would propel me in the right way.
Thanks.
Toni,
I’d suggest looking at the “Get Data” button in Excel. If your data source is in this reference guide I’d say yes. I can’t make general comments about this. It really does require looking at the specific program.
https://support.microsoft.com/en-us/office/import-data-from-data-sources-power-query-be4330b3-5356-486c-a168-b68e9e616f5a
Hi Mathew,
Is there any way to compare two excels and fetching the data like Joins.
Maha,
You must do this by writing Power Automate Desktop code. It cannot be done using a SQL Query.
Hello Matthew,
Thanks for your reply. I was also trying to use join for comparing two sheets.
Do you have any example where two sheets are being compared based on 1/2 columns like SQL join does?
I am curious to know the approach before using Macros.
Regards
Ghridihar,
I do not have any examples like this.
SELECT * FROM [Sheet1$], [Sheet2$]
WHERE [Sheet1$].[Column1] = [Sheet2$].[Column1]
How can I change column format from “General” to “Text” using the above query