R Script Help

  • Question

  • Hi All -

    I am new to R Services and am trying to load text from a multi-page PDF into SQL Server using the code below, which I found online. It works fine with a one-page PDF but not with a PDF that has multiple pages.

    I want to load all the pages using strsplit(). Maybe I am missing something in the code.

    Any help would be greatly appreciated.

    EXECUTE sp_execute_external_script
        @language = N'R',
        @script = N'
    # Load packages
    library(pdftools)

    # Read PDF
    text <- pdf_text("C://temp//AnyPdfName.pdf")

    # Split on new lines
    text2 <- strsplit(text, "\n")

    # Return output
    OutputDataSet = data.frame(text2)
    '
    WITH RESULT SETS (("PDF Text" varchar(max)));

    Thank you in advance.

    RH


    sql


    • Edited by sql9 Tuesday, February 19, 2019 11:11 PM text
    Tuesday, February 19, 2019 11:10 PM

All replies

  • Hi!

    When you say it doesn't work with multiple pages, what error are you getting (if any)?

    Also, have a look at the pdf_data function. It returns one data frame per PDF page.

    This evening I'll see if I can spin up pdftools, and see if I can get it to work.

    Niels


    http://www.nielsberglund.com | @nielsberglund

    Wednesday, February 20, 2019 3:34 AM
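
A likely cause, offered as a sketch rather than a confirmed fix: pdf_text() returns a character vector with one element per page, so strsplit() yields a list with one vector of lines per page. data.frame() treats each list element as a column, which fails when pages have different line counts (and would give several columns instead of the single "PDF Text" column when the counts happen to match). Flattening the list with unlist() first gives one row per line. The example below uses a made-up `pages` vector in place of pdf_text() so that it runs without pdftools or a PDF file.

```r
# Stand-in for pdf_text(): a character vector with one element per page.
# These two "pages" are made-up sample data, not output from a real PDF.
pages <- c("Page one, line A\nPage one, line B\nPage one, line C",
           "Page two, line A\nPage two, line B")

# strsplit() returns a list holding one character vector of lines per page.
lines_by_page <- strsplit(pages, "\n")

# Passing that list straight to data.frame() makes each page a column,
# so pages with different line counts raise an error (caught here as NULL).
direct <- tryCatch(data.frame(lines_by_page), error = function(e) NULL)

# Flattening the list first produces a single column with one row per line.
OutputDataSet <- data.frame(
  text = unlist(lines_by_page),
  stringsAsFactors = FALSE
)
print(OutputDataSet)
```

In the script from the question, this would amount to replacing the last assignment with OutputDataSet = data.frame(text = unlist(text2), stringsAsFactors = FALSE). That change has not been tested against SQL Server here.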