none
import specific columns from an Excel table RRS feed

  • Question

  • Hello,

    I have a large table in Excel with multiple columns. Is it possible to use M to import specific columns of the table rather than importing the whole lot then removing the unwanted columns?

    Thanks in advance,
    Mark

    Friday, July 14, 2017 7:13 AM

Answers

  • You can do so by importing the Excel table via an OLEDB connection and an SQL query. However, there is no certainty that you will get better performance by using this approach (usually works best when filtering a lot of rows prior to import).

    The steps:

    1) Get Data-->From Other Sources-->From OLEDB

    2) In The From OLE DB dialog box, enter the connection string:

    Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\MyExcelFile.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES";

    Substitute C:\MyExcelFile.xlsx with the full path of your Excel file.

    3) In the Advanced Options, enter the following query string:

    select [_].[MyColumn1],
             [_].[MyColumn2],
             [_].[MyColumn3],
             [_].[MyColumn4],
             [_].[MyColumn5]
    from [MyWorksheetName$] as [_]

    Substitute [MyColumn1], [MyColumn2], etc. with the actual column names that you want to keep.

    Substitute [MyWorksheetName$] with the name of the sheet containing the table (make sure to keep the '$' at the end of the sheet name).

    Sunday, July 16, 2017 8:36 PM

All replies

  • You can do so by importing the Excel table via an OLEDB connection and an SQL query. However, there is no certainty that you will get better performance by using this approach (usually works best when filtering a lot of rows prior to import).

    The steps:

    1) Get Data-->From Other Sources-->From OLEDB

    2) In The From OLE DB dialog box, enter the connection string:

    Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\MyExcelFile.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES";

    Substitute C:\MyExcelFile.xlsx with the full path of your Excel file.

    3) In the Advanced Options, enter the following query string:

    select [_].[MyColumn1],
             [_].[MyColumn2],
             [_].[MyColumn3],
             [_].[MyColumn4],
             [_].[MyColumn5]
    from [MyWorksheetName$] as [_]

    Substitute [MyColumn1], [MyColumn2], etc. with the actual column names that you want to keep.

    Substitute [MyWorksheetName$] with the name of the sheet containing the table (make sure to keep the '$' at the end of the sheet name).

    Sunday, July 16, 2017 8:36 PM
  • In M it looks like this:

    let

        Source= Odbc.Query("dsn=Excel Files;dbq=C:\FOLDER\ExcelFile.xlsx;defaultdir=C:\FOLDER;driverid=1046;maxbuffersize=2048;pagetimeout=5", "SELECT Sheet.ColumnName1 FROM `C:\FOLDER\ExcelFile.xlsx`.`'SheetName$'`as Sheet")
    in
        Source

    Wednesday, June 10, 2020 8:26 AM