none
How to learn M as a language? Where's my User Guide and Reference Manual? RRS feed

  • Question

  • M is the language I've been looking for for 30 years. A desktop ETL and data shaping language with powerful text manipulation capabilities. Its syntax is a little awkward, but straight forward and approachable.  I use it every day.

    But why is it so hard to learn?  Because there is so very little documentation available for it.  What documentation there is split out onto hundreds if not thousands of tiny web pages that each document a tiny concept in isolation.  Most of the examples are more like syntax diagrams than usable code snippets.  It reminds me of the "documentation" compiled for a large in-house software project by dumping the function headers and sticking them into a .hlp file.

    I have spent time with the Power "Query Formula Language Specification" and the "Power Query Formula Library Specification" PDF files.  Bother suffer the same problem with their examples and they are full of didactic language targeted more to a computer science programming language developer than to an end user.

    I have bought both Chris Webb's and Ken Puls books on the subject and was disappointed to find that neither spends much of any time on the language, but mostly the neat tricks you can perform in the GUI.

    Some people here in this forum have obviously figured it out.  I search here for examples and often find them, but I stare at the syntax and wonder how in the world they figured it out?  Some of the functions and parameters used don't even appear in the documentation that I am aware of.  How did you learn it?  Please share...

    What you can do in the GUI is amazing and a lot of what I need can be done there. But, I know that there is so much more that I could do by augmenting the generated code with different patterns.  I have figured a lot out, but it takes a lot of guessing and trial and error.  Can I use an "each" here?  In this context, does a column name go in quotes or in [] or do I use an _?  How the heck do I specify a carriage return literal here - cr, (cr), #(cr), "#(cr)"?  Why is this function documented to use several parameters, but I see it in this working example in the forum with none?

    BTW, I hate that it seems to have been trivialized as it has gone from being "M" to the "Power Query Formula Language" to "that stuff in the Get & Transform ribbon tab" - it makes it that much harder to search for relevant information about it on the internet.

    So, where can I go to learn about this fabulous language, in a structured manner?  Where's my User Guide and Reference Manual?

    Dale

    Thursday, March 31, 2016 6:29 PM

Answers

  • Hi Dale,

    Thanks for the feedback! It's great to hear that our advanced users want to get more proficient with M.

    Have you seen the M Introduction on MSDN?

    The section on Values and Operators might help clear up {} vs []

    The section on Functions tries to explain what "each" does.


    I agree that the M language spec and library spec PDFs can be pretty technical, and sometimes they get out of date as we improve the library.

    The library spec PDF is probably the best place right now to learn about text literals. If section 2.6.1 isn't clear enough, let us know!

    We're actively working on improving the MSDN pages, but for right now the best place to find up-to-date library documentation is from within Editor. (If you want the latest and greatest, the current Power BI Desktop release is often a few months ahead of Power Query in Excel.)

    I'll work with the author of the M Introduction to implement your feedback.

    One other resource I didn't see mentioned was Stack Overflow (tag powerquery), which is a great place to get M code questions answered.

    • Marked as answer by Dale Hohm Friday, April 8, 2016 11:20 AM
    Thursday, April 7, 2016 10:15 PM

All replies

  • Have you tried Power Query Library?  Whilst not perfect, I generally find what I need when I look on there.

    Regards

    Laurence

    Friday, April 1, 2016 10:42 AM
  • Another option is to ask PowerQuery to displays its own built-in help: use the M code below to obtain a list of all M buil-in functions. When you select a cell in the value column, PowerQuery displays a little help about the corresponding.

    let
        List_Of_M_Functions = Record.ToTable(#shared)
    in
       List_Of_M_Functions
    I usually create this query in each new workbook so the help is always no more than a few clicks away.

    Friday, April 1, 2016 12:13 PM
  • Totally share your enthusiasm about M. Make it 20 years for me :-)

    When I first read the "Query Formula Language Specification" I didn't understand anything of it.

    Took me over a year to get my head around it (using M in my daily work, reading books and blogs and actively participating in forums as a training ground).

    I've put together my learning resources on my blog here: http://www.thebiccountant.com/learning-resources


    Imke Feldmann TheBIccountant.com


    Saturday, April 2, 2016 4:44 PM
    Moderator
  • Thanks Imke,

    That's what I feared the current answer is... just go try to digest the Formula Language Specification.  

    (Note that there is a little bit in the Introduction to Power Query Formula Language https://msdn.microsoft.com/en-us/library/mt270235.aspx, but it is way to brief)

    Also, thanks Bertrand and Laurence - I was aware of both of those sources.  They just don't fill the void.

    For one example, I've found Splitter.SplitByNothing used in some solutions in this forum (one provided by Imke too, I think).  You will not find that in the formula reference and in the "List of Functions" trick above, it gives a not very helpful description:

    function () as function

    Returns a function that does not splitting, returning its argument as a single element list. (typo NOT mine)

    That is not helpful to me.

    Where can I get an introduction to basic syntax for the various constructs so that I can easily write something like this with the correct () {} [] combinations?

    Grouped = Table.Group(ChangedType, {"Row"}, {{"Fact", each Text.Combine([Fact], ", "),type text}})

    Where can I readily learn about the syntax for literals like #(cr)?

    Where can I find examples of testing for different data types?

    I'll stop for now.

    My point is that I don't think that the language is really all that complex, but the basics that you typically find in an accessible form for nearly any other language are simply not available or are are not documented in a fashion that makes it accessible to a large percentage of the user base that could so easily use the language to greatly enhance their usage for Excel, Power Query, Power Pivot and Power BI.

    I hope the BI team takes note... and Chris and Ken - I would gladly buy a second edition of your books if you were to address this.

    Tuesday, April 5, 2016 11:28 PM
  • Sorry, cannot remember where I used it (you'd need to send me the link for an explanation of how it works)


    Imke Feldmann TheBIccountant.com

    Wednesday, April 6, 2016 5:21 AM
    Moderator
  • Thanks Imke,

    If you search the forum for Splitter.SplitByNothing Imke you'll actually find several, but this specific function is not the point of my post of course.

    Dale

    Wednesday, April 6, 2016 12:55 PM
  • Thought so. But now that I'm on it:

    This is code that has been generated automatically when I was creating the query, which is often a mixture between using the UI and editing code.

    It belongs to the Table.FromList-function and is optional actually. (I often try to delete optional parameters that have been generated, but these skipped me so far).

    There's 2 sides to the coin here: Love the ability to mix UI and code - but might end up with redundant code bits.


    Imke Feldmann TheBIccountant.com

    Thursday, April 7, 2016 6:21 AM
    Moderator
  • If you search Amazon for "Power Query" you'll find some books that cover introductory M. M is for Data Monkey might be the kind of resource you're looking for.

    Ehren

    Thursday, April 7, 2016 7:43 PM
    Owner
  • Thanks Ehren - I mentioned in my base post that I have that book (Ken Puls) and Chris Webb's book and I read a half dozen blogs and this forum regularly.  M is for Data Monkey does spend some time addressing M syntax, but does not attempt to be complete and is aimed in large part at Excel users trying to adapt to M.  Similarly, Power Query for Power BI and Excel (Chris Webb's book).

    My point is that M is a language with unique syntax and capabilities and would benefit from being documented accordingly - like C, C#, F#, VB, PowerShell and even VBA.  It is at the core of much of the Microsoft BI stack and deserves serious treatment and not just to be treated as the quaint stuff generated by the Get and Transform UI.

    My point as well is that I think that just getting the basics documented would go a long way toward getting a larger audience to understand the potential of the language and open doors to much broader use beyond the code generated by Get and Transform.

    Thursday, April 7, 2016 8:40 PM
  • Ah, sorry I missed that. If the currently available resources aren't cutting it for you, I'd recommend that you simply post your M questions on this forum.

    And I'll pass along the feedback about some kind of introductory explanation of the basic syntax etc. that's more approachable than the language spec.

    Ehren

    Thursday, April 7, 2016 8:50 PM
    Owner
  • Hi Dale,

    Thanks for the feedback! It's great to hear that our advanced users want to get more proficient with M.

    Have you seen the M Introduction on MSDN?

    The section on Values and Operators might help clear up {} vs []

    The section on Functions tries to explain what "each" does.


    I agree that the M language spec and library spec PDFs can be pretty technical, and sometimes they get out of date as we improve the library.

    The library spec PDF is probably the best place right now to learn about text literals. If section 2.6.1 isn't clear enough, let us know!

    We're actively working on improving the MSDN pages, but for right now the best place to find up-to-date library documentation is from within Editor. (If you want the latest and greatest, the current Power BI Desktop release is often a few months ahead of Power Query in Excel.)

    I'll work with the author of the M Introduction to implement your feedback.

    One other resource I didn't see mentioned was Stack Overflow (tag powerquery), which is a great place to get M code questions answered.

    • Marked as answer by Dale Hohm Friday, April 8, 2016 11:20 AM
    Thursday, April 7, 2016 10:15 PM
  • Thanks Ehren and Carl!

    Yes, I do use Stack Overflow and have read through the M Introduction and I have learned what I have about text literals from the Library Spec PDF.  

    My reason for writing this post was only partly out of personal frustration.  I'm all-in on Power Query and M.  I use them every day and I'm an evangelist.  I want resources that I can point others to so that they don't have to keep coming back to me for the "magic".  I believe the M language deserves a comprehensive, structured user document and a function reference.  

    I'd also like to see M elevated to a visible and respected place instead of being hidden behind the Power BI and Get & Transform UIs.  

    The developers have done a great job of quickly evolving the functionality and expanding it's utility (I really miss the monthly release cycle for Power Query).  I cannot use Power BI in my production environment yet, but hope to in the future.

    My personal path is going to be to just bite the bullet and print off copies of the language and formula specs and read and reread until it is all second nature.  I'll make it through.  I just hope that it doesn't have to be so hard for everyone who follows.

    Dale

    Thursday, April 7, 2016 11:01 PM
  • My M experience parallels my DAX experience. Early on (when Power Pivot was Gemini), we all struggled to come to terms with the nuances with the language (quite normal). DAX became easier along the way as folks understood it more, blogged about it more, and engaged with members of the Power Pivot Team. Note that it took six years for a definitive book on DAX to be written, which is no consolation for the M user. :)

    I've mentioned it before on this forum, but most of my understanding of M has come from building a lot of general purpose custom functions to handle "missing functionality" in the language. This site has been a source of inspiration for both solving problems and learning from how other folks solve problems. On rare occasions, we get insider solutions that go beyond anything that's in the documentation, and thus beyond that which we can figure out on our own. Blog sites can often provide interesting insights, the value depending on how much you already know.

    I agree that the M documentation could be better, but its not going to come out of Microsoft. Here is a brief assessment of my experiences with Formula Language Specification.

    1. Section 1.7. Metadata – In M’s entire existence, we’ve seen one example of using a metadata record, and not a useful example at that. The reader can skip the discussion without missing out on anything important.
    2. Section 2. Lexical Structure – Overly formal, tedious, and boring. Lexical structure, lexical analysis, documents, grammar conventions, whitespace, tokens, etc. Phew! I don’t know how I ever got through this section without falling into a coma. Even classical works like “The C Programming Language,” and “The C++ Programming Language” (both books written by the authors of these languages), are far less formal, far more engaging from the start, and don’t use any of this computer science class jargon.
    3. Throughout the document, I’ve found that the abstract italicized hierarchies used to describe structural elements of the language impossible to follow or remember in most cases.
    4. Section 3.3. Environments and variables – Boring, tedious, and at times confusing.
    5.  Section 4. Values – This section is very well done and complete.
    6. Section 5. Types – And section 4 is followed by the worst section in the document! After 3+ years I’m still confused. Examples:
      1. The section begins by describing types. In the first bullet, records, lists, tables, functions are included in the list of primitive types. The subsequent bullets list these types again. Are they supposed to be different from the records, lists, etc. that are already included within the primitive types?
      2. How is a type of “none” generated?
      3. “The primitive-type names are contextual keywords recognized only in a type context. The use of parentheses in a type context moves the grammar back to a regular expression context, requiring the use of the type keyword to move back into a type context.” Eh?
      4. I find the entire discussion of custom types, as well as the type functions baffling. I understand types in the context of function parameters and table columns for instance. However, if you create a custom type for a list e.g. ListType = Type.ForList ({type number}), how do you apply this type to an actual list? And after you apply the type to a list, then what? Same for all of the other type functions.
      5. Where do types like int8, int16 etc. fall? They aren’t mentioned at all.
      6. The section is longer than many others in the document and clearly demonstrates the M language’s obsession with types. Yet, there are zero online discussions I can find that clarifies any of this type stuff. To add insult to injury, the one place I'd want to use custom types in some way is with function arguments, but such use isn't supported.
    7. Sections 6, 7 & 8 are generally well done.
    8. Section 9. Functions -  A entire section dedicated to the discussion of functions, that doesn’t bother to tell you the syntax for passing your own lambda function into another function. Odd.
      1. Section 9.5 Closures – Don’t already know what a closure is, or how to create a useful one? If so, don’t expect to find the answer here. It assumes that you already know the answer. Hint: you can create closures with any of the built-in Splitter, Combiner, or Replacer functions. You can create your own custom closures as well, if you can figure out a useful reason to do so.
    9. Section 10 is well done.
    10. Section 11. Sections – Uses 5 pages to explain useful functionality that doesn’t exist, has never existed, and may never exist. Should be removed.
    11. Section 12. Consolidated Grammar – If this section were removed from the document, I probably wouldn’t notice.

    "I'd also like to see M elevated to a visible and respected place instead of being hidden behind the Power BI and Get & Transform UIs."

    Totally agree. This is an issue that I've thought about considerably. Easiest functional language on the planet!

    • For starters, it would be great for creating UDFs in Excel, without having to worry about all of the VBA macro security BS.
    • Could be used as an alternative to JavaScript, but compile to JavaScript! Great for using the JavaScript APIs in Excel through encapsulation.
    • Could be extended to support GUI stuff by adding reactive functionality ( RxJs anyone?).

    Okay, I'm now out of The Matrix. :)

    "I just hope that it doesn't have to be so hard for everyone who follows."

    Well, unless the available documentation is improved, it will be hard for everyone else that follows - especially Excel folks. Hmmm, it just occurred to me that I don't even know the target audience for the language specification.


    Monday, April 11, 2016 8:38 PM
  • Hi Dale,

    I completely agree with what you're saying here - I think there is a need for a book devoted to the M language. To be honest, when I wrote my book (about two years ago now) I had only just about mastered the basics of the language and didn't have much real-world experience, and that's why I only devoted a single chapter to it. I would love to write the kind of book you're looking for and I may well get around to it at some point - though probably not this year!

    I also agree that the language needs a proper name: the whole M/Power Query Formula Language mess drives me nuts. I have complained to Microsoft though and they were sympathetic, so I hope this gets resolved soon.

    Chris


    Check out my MS BI blog I also do SSAS, PowerPivot, MDX and DAX consultancy and run public SQL Server and BI training courses in the UK

    Thursday, April 14, 2016 12:57 PM
  • Thanks for adding your thoughts Chris.

    I'd pre-order that book today, if you were writing it.

    Dale

    BTW, I've learned more about M/Power Query, DAX/Power Pivot and (perhaps reluctantly) MDX from your blog (https://blog.crossjoin.co.uk/) than from any other single location.  Thanks.

    Thursday, April 14, 2016 9:34 PM
  • hey Dale!

    you're right about the m is for data monkey book. The book is aimed towards the excel users that want to try Power Query or have tried it but want to take their knowledge to a new level. The book is based on the Power Query Workshop and some of the content that you're mentioning can only be found in the workshop and not in the book. This content could come in a future book, but perhaps you might see a blog post about this pretty soon :) btw, you are absolutely right about the language. Is not hard, it just looks complicated because it's always in plain text.

    best!

    Miguel Escobar

    Friday, April 15, 2016 10:20 AM
  • The language specification is intended as a semi-formal description of the language and not as a mechanism for people to learn the language. Unfortunately (as you've said) we don't really have anything like the latter other than the introduction on MSDN. Because the specification is aimed at practitioners, some of the terminology and conventions it uses may seem strange if you're not used to them.

    A "closure", for instance, is just a way of saying "a function which encapsulates some of its surrounding state. So the definition "(a) => (b) => a" is a function which returns another function (that happens to be a closure) and we say that it "closes over" the parameter a to the original function.

    The "italicized hierarchies" as you describe them are basically a "Backus-Naur Form" representation of the language grammar.

    Some other notes:

    Metadata is used to carry extra information "out-of-band" with the actual value of the data. For instance, the documentation you see in the editor by entering a function name is actually represented as metadata on the type of the function. The editor knows how to look for it and format it for display. Similarly, the navigation and function invocation experiences in Power Query are largely driven by metadata.

    Sections definitely do exist, though they're not visible to the end-user in Power Query. If you have a workbook with three Power Queries in it, we store those queries as a single section document with three shared members. You only end up seeing the expressions associated with each of those members, though. If we ever manage to finish a publically-visible library mechanism, you will see sections there.

    Types are confusing in part because our feelings about them have changed with time and necessity. As one example, we formally have only one "number" type -- just like JavaScript. Other "types" like Int32.Type are considered to be facets of the number type, and originally didn't even have a language or library footprint.

    One of the disadvantages of success is that we don't have any time to clean up some of these rough edges. If no one were using the product, we'd have until the project was cancelled to make all of this much prettier :).


    Friday, April 15, 2016 1:36 PM
  • The language specification is intended as a semi-formal description of the language and not as a mechanism for people to learn the language. Unfortunately (as you've said) we don't really have anything like the latter other than the introduction on MSDN. Because the specification is aimed at practitioners, some of the terminology and conventions it uses may seem strange if you're not used to them.

    The problem is that the language spec is the only M language document available to Excel folks, and the official jargon doesn't apply to them. Let's not forget that M (via Power Query) was once the exclusive domain of Excel, so I think we can agree that a less formal document would be useful.

    A "closure", for instance, is just a way of saying "a function which encapsulates some of its surrounding state. So the definition "(a) => (b) => a" is a function which returns another function (that happens to be a closure) and we say that it "closes over" the parameter a to the original function.

    I've used closures for creating custom functions like Text.SplitTextByFixedLength, using Splitter.SplitTextByLength, which takes in a list of lengths, and passes in the text to be split into the Splitter's return function. I needed numbers converted to text to be grouped by 3s, so that I could create a function to spell out numbers as words. However, my comment was about Excel users in general - the vast majority wouldn't has a clue about closures. I think this point is covered in the previous one, so there's nothing more to be said.

    Before leaving the topic, I must admit that in M, I haven't been able to figure out a good reason to create a custom function for the purpose of creating a closure - a knowledge gap I guess.

    The "italicized hierarchies" as you describe them are basically a "Backus-Naur Form" representation of the language grammar.

    Perhaps a bit of overreaction on my part. From the perspective of a formal spec, the  Backus-Naur Form representation probably must be included. From a practical viewpoint, the notation is a distraction. Actually, I'm now beginning to think that the more formal text should have been put in an appendix, instead of being of being strewn all over the more practical aspects of M that are described in the spec. 

    Some other notes:

    Metadata is used to carry extra information "out-of-band" with the actual value of the data. For instance, the documentation you see in the editor by entering a function name is actually represented as metadata on the type of the function. The editor knows how to look for it and format it for display. Similarly, the navigation and function invocation experiences in Power Query are largely driven by metadata.

    Wouldn't it be great if the spec. mentioned that? In practice, what examples are there for creating and applying metadata records as is so painstakingly described in the spec?

    Sections definitely do exist, though they're not visible to the end-user in Power Query. If you have a workbook with three Power Queries in it, we store those queries as a single section document with three shared members. You only end up seeing the expressions associated with each of those members, though. If we ever manage to finish a publically-visible library mechanism, you will see sections there.

    Another invisible feature! Your last sentence doesn't sound very encouraging :(

    Types are confusing in part because our feelings about them have changed with time and necessity. As one example, we formally have only one "number" type -- just like JavaScript. Other "types" like Int32.Type are considered to be facets of the number type, and originally didn't even have a language or library footprint.

    Just to be clear - I find only the description of custom types to be confusing. Are you saying that we can safely ignore them?

    One of the disadvantages of success is that we don't have any time to clean up some of these rough edges. If no one were using the product, we'd have until the project was cancelled to make all of this much prettier :).

    What you need is a team dedicated to the ongoing development of M, where trunks could be spun off from time-to-time to meet the current needs of the host applications.




    By the way, thank you for taking the time to respond. :)
    Friday, April 15, 2016 6:19 PM
  • Hi Dale, I have just started trying to use Power Query/M and agree with every point you have raised. I too cannot find any worthwhile documentation for practitioners iworking in the BI space - and I'm very frustrated because : a) I suspect this scripting language could be a tremendous step forward for solving a number of everyday I ssues in the field, and b) I don't have any choice but to learn it because both Excel pivot tables & Power BI use it !! Is there any positive update in the last year? Many thanks, Peter
    Wednesday, August 23, 2017 8:33 PM
  • Hello folks!

    I am also searching for a 'better' way to learn Power Query (M). I see that Chris is offering some training courses in 2017 that I am going to look into.

    Have any other resources popped up in the last year?

    Thanks, H

    Friday, September 1, 2017 7:39 PM
  • I would just like to say that if you ignore the grammar parts of the language specification it is actually quite suitable as an introduction to the language for those who are familiar with another programming language.

    It explains the language in quite a simple manner I think, and provides plenty of examples.

    The language is quite simple and elegant, but the main part is learning what is available in the standard library. That is where M / Power Query lacks some discover-ability; good intellisense support would help with that. (Record.ToTable(#shared) helps though)


    • Edited by gentk Sunday, September 10, 2017 12:44 PM
    Sunday, September 10, 2017 12:43 PM
  • I'll add one more vote to getting more visible and coherent learning materials and references for M. I'm a surface-scratching Excel user who's learned enough to get around, but could use a more structured guide to get deeper and to begin optimizing my queries, build more functions and get a better understanding under the hood.

    I've gone about as far as the basic resources can take me; I'll be digging into the resources some of you folks have mentioned (and written), but it would be really helpful for Microsoft to put something out for those of us who would be generally classified as casual users who need to go a little beyond the basics without making the languages our job focus...

    Tuesday, November 14, 2017 9:00 PM
  • Thanks for this cool code.
    Friday, November 17, 2017 12:40 PM