Screen scraping function for return YTD?

donheff,
Just one more guess, but which editor do you use in the forum?
There is a drop-down in UserCP for that...
I don't know what UserCP is. I just use the reply panel that is part of the forum. I am not aware of any alternative. But the problem appears to be my browser's handling of what it perceives as a dangerous character string. It simply removes the dangerous characters from view (or copy). Since it happens with some browsers and not others, it would appear the problem is on my end.

This isn't something I am concerned about since it would rarely be an issue and it was easy to work around.
 
I experimented a bit with those extra long URLs. It seems like everything after the ticker symbol is just controlling the output format. We don't care about fonts and whether tables have grids and whatnot, so we can just delete all that stuff.

I updated my spreadsheet to use these URLs instead of the long ones and concatenate in the exchange and symbol:

-http://performance.morningstar.com/perform/Performance/fund/trailing-total-returns.action?&t=XNAS:VTIAX
-https://quotes.morningstar.com/fundq/c-header?&t=XNAS:VTSAX
-https://etfs.morningstar.com/etfq/quote-banner?&t=ARCX:VTI

Sheet Here
Your sheet is very handy. I just got around to implementing the YTD formula, since my googlefinance returnytd was way off, being based on end of November.

In transferring the appropriate formula to my existing spreadsheet, I ran into a problem. In my receiving spreadsheet I added a column for Exchange. My VTSAX row worked as expected, but VTIAX row returned N/A error.

After a few more cups, I reset the column to Automatic format, and there were no errors.

Another unexpected thing is that I do not need to enter an exchange, such as XNAS, for any of my funds. I still have the reference to the cell, but cell is empty, and the formula still works.

All's well that ends well. At least for now.

Thanks for this formula.
 
Cathy, if you are still following this thread maybe you can help me fix my TSP daily price function.

For years I have used the below quoted ImportHtml function to pull in the daily TSP G Fund price:
=INDEX(ImportHtml("https://www.tsp.gov/investmentfunds/shareprice/sharePriceHistory.shtml", "table",), 2, 7)
Hopefully, quoting it will keep special characters from screwing up in the post.

A few days back the function started returning a reference error. I checked the URL for the table in question and it now redirects to a server that is not available. Since the TSP share price page (https://www.tsp.gov/InvestmentFunds/FundPerformance/index.html) still displays the data in the same table format I assume they have moved the data source. I tried using Chrome Dev tools to find out what is going on but was not successful. The page does not appear to be using an XML function like Morningstar to grab the data and I can't find any way to find where the supporting table is.

If Cathy or anyone can figure it out please let me know.
 
Try this:


I also added it to that same spreadsheet we were using before: https://docs.google.com/spreadsheets/d/1n0EFXeJRG-P5h5U8B5X97Kcs6sFTmxGnoeicwYtp98A/edit#gid=0
Excellent. But the line you posted above doesn't work. You have it a little different on the spreadsheet which does work:
=importxml("https://www.tsp.gov/InvestmentFunds/FundPerformance/index.html","//table[1]/tr[2]/td[7]")

I tried to find out if the TSP was using an appraoch similar to Morningstar to pull the data but when I used Chrome Dev tools Network>XHR which showed the source tables for M* I got nothing from the TSP page. Does your importxml function simply look through the index.html page for an appropriate table?

Thanks, once again.

Edit: I can still see that the line is not displaying as copied from the spreadsheet. Let me try it quoted:

=importxml("https://www.tsp.gov/InvestmentFunds/FundPerformance/index.html","//table[1]/tr[2]/td[7]")

Now all the lines are displaying properly. Gremlins in my browser.
 
Last edited:
All the functions in my post and yours look the same to me in Chrome. I do see the same issue you describe in MS Edge though. When my computer is busy, I can actually see it draw the correct thing on the screen and then hide some of the characters, so it seems like the browser itself is suppressing it. No idea why. Anyway, the spreadsheet seems like a more reliable way to share these functions.

To answer your question though, this function is retrieving the price from the index.html page because TSP is not using the same mechanism that Morningstar uses on their pages. Morningstar loads an HTML doc that just contains the graphics and layout and then fills in the data with the XHR calls. If you ask for different data, it just makes more XHR calls and fills in the existing doc with new data.

TSP is loading the entire page in the initial doc. If you go to Network>Doc in the Chrome Dev tools and look at the source there, you'll see that it contains all the numbers that are displayed on the screen. If you change the date range and click Retrieve Share Prices, it loads an entirely new doc.
 
Thanks Cathy. Your importxml approach works fine. I now see that if I replace the new index.html address in my old ImportHtml function that works as well.
 
Double forward slash may be getting interpreted as start of a comment by some browsers? Or maybe some hidden unicode character is messing with the browser display.
 
Back
Top Bottom