Download XML file using Selenium
I managed to log in to a website using a VBA script within Excel with the Selenium add-on, and everything works fine.
After I am logged in, an XML file is loaded in the browser, which I want to download to a local file to be analyzed further in Excel.
How can I download the XML file line by line, or otherwise the full website, to a file using Selenium? So far all efforts to scrape the contents have failed because I do not get a match with the FindElementBy... methods - the information displayed is XML and not HTML :-(
Do you have any idea how I can save the XML file to a local file?
Thank you very much for your support on this issue.
You can write it to a local file using plain VBA. Let's say you want to download this XML file - https://bin.codingislove.com/raw/vezawabegi
Then the code looks like this:
Sub downloadXmlSelenium()
    Dim bot As New WebDriver
    Dim myfile As String

    bot.Start "chrome", "https://bin.codingislove.com"
    bot.Get "/raw/vezawabegi"

    myfile = Application.ActiveWorkbook.Path & "\result.xml"
    MsgBox myfile

    ' Print writes the text as-is; Write would wrap it in quotation marks
    Open myfile For Output As #1
    Print #1, bot.FindElementByTag("body").Text
    Close #1

    bot.Quit
End Sub
Thank you very much for your reply. This seems to be a nice way, but unfortunately it does not work.
The site I'm using does not embed the XML file in HTML tags, so there is no "body" tag to use for the extract. Chrome does provide some basic HTML structure even if the server delivers a plain XML file, and I could grab the class "pretty-print" to extract the XML details, but as I usually use Firefox this does not work. In addition, the XML file contains more than 5,000 entries, so working on the "body" tag or the "pretty-print" class takes a very long time...
Isn't there any other solution to save the whole website to a local file instead of grabbing only defined parts of it?
Thank you very much for your support,
You are looking at a page which fetches XML from some API, renders the XML and formats it using CSS. The best solution is to find the source URL of the XML file by inspecting the XHR requests of that page (for example in the Network tab of the browser's developer tools). Once you find the source, make a simple GET request and grab the XML.
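As a rough sketch of that last step, assuming the source URL is already known: the code below sends a GET request with MSXML2.XMLHTTP and writes the response to a file next to the workbook. The URL is only a placeholder taken from the example above, and the Sub name and output file name are made up for illustration. Note that this request does not share the login session of the Selenium browser, so if the XML is only served to logged-in users, the authentication (for example a session cookie) would have to be supplied separately.

```vba
Sub DownloadXmlDirect()
    ' Placeholder URL (assumption): replace with the source URL found among the XHR requests
    Const SOURCE_URL As String = "https://bin.codingislove.com/raw/vezawabegi"

    Dim http As Object
    Dim targetFile As String
    Dim fileNo As Integer

    ' Late binding, so no reference to Microsoft XML has to be set in the VBA editor
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", SOURCE_URL, False   ' False = synchronous request
    http.send

    ' Stop if the server did not answer with 200 OK
    If http.Status <> 200 Then
        MsgBox "Request failed: " & http.Status & " " & http.statusText
        Exit Sub
    End If

    ' Write the whole response in one go instead of scraping elements
    targetFile = ActiveWorkbook.Path & "\result.xml"
    fileNo = FreeFile
    Open targetFile For Output As #fileNo
    Print #fileNo, http.responseText
    Close #fileNo
End Sub
```

Because the whole document arrives as one string, this should avoid the slow element lookup on a file with several thousand entries. Print writes the text in the system's ANSI code page, so if the XML contains characters outside that code page, a different way of writing the file may be needed.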