Fidelity Html Statements: Gone?

I downloaded an account PDF from Schwab and opened it with my ancient copy of Acrobat 9. I exported it to HTML, and the result has the information in HTML tables.

I think you can craft a solution in any OS using this insight and some extra tools.
I exported HTML from the Fidelity PDF using the PDFBox Java library, and everything was in <p> tags, not tables. There's another library, iText, which I haven't investigated. As much as one might suggest "it can be done", that doesn't mean "it can be done reliably and without too much coding". Here's a Stack Overflow answer that came up just now in my search for iText:
There is essentially not an easy cut-and-paste solution because PDF isn't really very interested in structure.

If you want to do this in PDF itself (where you would have the majority of control over the process), you'll have to loop over all text on pages and identify headers by looking at their text properties (fonts used, size relative to the other text on the page, etc...).
On top of that you'll also have to identify paragraphs by looking at the positioning of text fragments, white space on the page, closeness of certain letters, words and lines... PDF by itself doesn't even have a concept for a "word", let alone "lines" or "paragraphs".
To complicate things even more, the way text is drawn on the page (and thus the order in which it appears in the PDF file itself) doesn't even have to be the proper reading order (or what us humans would consider to be proper reading order).
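To make the quoted answer a bit more concrete, here is a minimal Python sketch of just the "positioning" step. It assumes some extraction library has already handed you text fragments with x/y coordinates; the `Fragment` tuple and the `y_tolerance` value are my own illustrative choices, not anything from PDFBox or iText. It groups fragments into lines and puts them in reading order, regardless of the order they were stored in.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    """Hypothetical text fragment: position of its left edge plus the text."""
    x: float
    y: float  # distance from the top of the page, so smaller means higher up
    text: str


def fragments_to_lines(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[str]:
    """Group fragments whose y values are close, then read each group left to right."""
    lines: List[List[Fragment]] = []
    for frag in sorted(fragments, key=lambda f: f.y):
        # Stay on the current line only if this fragment sits on (almost) the
        # same baseline as the first fragment of that line.
        if lines and abs(frag.y - lines[-1][0].y) <= y_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    return [" ".join(f.text for f in sorted(line, key=lambda f: f.x)) for line in lines]


# Made-up fragments, deliberately listed out of reading order.
sample = [
    Fragment(300.0, 100.5, "1,234.56"),
    Fragment(50.0, 120.0, "Dividend"),
    Fragment(50.0, 100.0, "Balance"),
    Fragment(300.0, 120.3, "12.00"),
]
result = fragments_to_lines(sample)
print(result)
```

Real statements need much more than this (columns, wrapped cells, headers identified by font), which is exactly the point the answer is making.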
I download transactions using Moneydance, which uses OFX.

I wonder if you could use that for transaction downloads in a custom app?
I saw OFX mentioned in one of the reddit posts. I poked at it for just a minute, and it apparently is a 3rd party that I'd need to trust with my credentials. So far in life, I've not gone down that road.
 

I don't think that's accurate. These programs (Quicken/Moneydance) appear to use OFX Direct. This allows a direct connection between your computer and the financial institution. For example, I see that Schwab Bank uses a direct connection to 'https://ofx.schwab.com'.

I think the only 3rd party part is the OFX database that stores financial institutions and their details. Once you know those details, I don't think you need to connect to the 3rd party site, you'd only connect directly with the financial institution.

However, looking at the Moneydance output, it looks like Schwab isn't working. I'm getting an 'access denied' error, and reddit says that Schwab pulled OFX/Direct Connect support. So much for that, although a post on the reddit thread indicates that Schwab, having discontinued OFX, is going to provide its own APIs (https://www.reddit.com/r/GnuCash/comments/jyc66b/schwab_ofx_discontinued/).

As for Fidelity, OFX/Direct Connect still works and it connects to 'https://ofx.fidelity.com'.

If you're using Python, there's an ofxtools library that you could use. It looks like you have to enter your username/password, and you'd want to review the code to make sure it's not doing anything nefarious. Here's some documentation which shows how it works: https://ofxtools.readthedocs.io/en/latest/client.html.

Of course, this isn't for everybody, but if you're ok with how it works and spend time to understand what it's doing, it could be a solution.
 
I used to scrape the websites themselves, after hardcoding my login/password into a "script" using a Selenium webdriver back in the day. It was slick: I could have the script run upon Windows login and automatically export to a Google Sheets file. Then MFA (multi-factor authentication) and tokens became a "thing". No more real-time, automagical updates. It was only an issue with the mutual funds, so I just sold them all, bought into ETF equivalents, and use the =GOOGLEFINANCE("VUG","price") function in Google Sheets. It works. For now.


Technology is always changing.
 
I exported HTML from the Fidelity PDF using the PDFBox Java library, and everything was in <p> tags, not tables. There's another library, iText, which I haven't investigated. As much as one might suggest "it can be done", that doesn't mean "it can be done reliably and without too much coding". Here's a Stack Overflow answer that came up just now in my search for iText:
I saw OFX mentioned in one of the reddit posts. I poked at it for just a minute, and it apparently is a 3rd party that I'd need to trust with my credentials. So far in life, I've not gone down that road.
Step 1 - Open your PDF in a text editor. Check the PDF version. Use this when investigating libraries and apps. Not every library keeps up with PDF version increments.

Clipboard02.jpg
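Step 1 can also be scripted. Here is a small Python sketch that reads the version from the `%PDF-x.y` header at the very start of the file. The function names are mine, and keep in mind that newer PDFs can override the header version elsewhere in the file, so treat this as a quick first check rather than the final word.

```python
import re
from typing import Optional


def pdf_version(header: bytes) -> Optional[str]:
    """Return the version from a PDF header such as b'%PDF-1.4', or None."""
    match = re.match(rb"%PDF-(\d+\.\d+)", header)
    return match.group(1).decode("ascii") if match else None


def pdf_version_of_file(path: str) -> Optional[str]:
    """Read only the first few bytes of the file; the header is at the very start."""
    with open(path, "rb") as handle:
        return pdf_version(handle.read(16))


# Example header bytes (made up here; a real file starts the same way).
print(pdf_version(b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"))
```

For a real file you would call `pdf_version_of_file("statement.pdf")`, where the file name is just a placeholder.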

Step 2 - Use Acrobat (I used 9.x) to export HTML. I had 2 choices - HTML 3.2 or 4.0 with CSS. I chose the 1st option, and you can see the result in the picture below.

Clipboard01.jpg

The screens show that pdf does have structure, although a particular library or tool may not be current, or the transform is elementary (like all p's). In this case Adobe engineers are smart enough to get the tabular data correctly into HTML elements.
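Once the export really does contain table elements, pulling the rows out takes only a few lines. This is a rough Python sketch using only the standard library; the sample HTML and its numbers are made up, and I'm assuming the export uses ordinary tr/td/th tags (I haven't checked what Acrobat's output looks like beyond the screenshot).

```python
from html.parser import HTMLParser
from typing import List


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])      # a new row starts
            self._in_cell = False     # tolerate cells that were never closed
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")  # a new, empty cell in the current row
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def extract_rows(html: str) -> List[List[str]]:
    """Parse the HTML and return table rows as lists of trimmed cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]


# Invented sample, standing in for an exported statement table.
sample_html = """
<table>
<tr><th>Symbol</th><th>Quantity</th><th>Price</th></tr>
<tr><td>VUG</td><td>10.000</td><td>300.00</td></tr>
</table>
"""
rows = extract_rows(sample_html)
print(rows)
```

Nested tables or merged cells would need more care than this sketch gives them.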

The structure in a pdf is meant for viewing or printing. So it has structure dedicated to that purpose. Acrobat is a tool which can change the viewed file structure to various export formats (like HTML). Other apps and libraries have this function too.

Acrobat also has a tool for selection of the text on screen, with save table, copy table, and open table in spreadsheet. I use that feature a lot.

There are other apps and libraries which perform like this, but I'm sure there's a bit of research and trial. Since Google knows more than Reddit, I might try the first result that comes up when I look for "pdf2html". PDFTRON is the company, I think.

My Step 2 gets a result, which is what I do every month. OTOH, you are a programmer and want to include as much logic and automation as possible. I hope this all helps you break out and find a complete solution.
;)
 
I don't think that's accurate. These programs (Quicken/Moneydance) appear to use OFX Direct. This allows a direct connection between your computer and the financial institution.
Thanks for correcting my interpretation of how ofx worked. It could very well be "the right approach".
Then MFA (multi-factor authentication) and tokens became a "thing"....
Interesting that you mention that part of your journey. Last night before drifting off, I had a research idea...what if I could do authentication manually, with tokens or annoying captcha challenges, then, once authenticated, let the automation I write take over (riding on the manually induced authentication). Might need to be done in a browser plug-in, which is an environment where I've never attempted to code anything. The way I do it today is I have a browser plug-in (KeePassXC now, was LastPass before). I authenticate with the password manager manually, then my scripts don't need the password hard coded because the password manager populates the credentials automatically. That doesn't work if there's an email loop, but those challenges are rare. I don't have the second factor set-up, so don't have to deal with that, but some day they might make it too painful not to have the second factor set-up.
Step 2 - Use Acrobat (I used 9.x) to export html.
If PDFBox gave me the level of formatting you got from Acrobat, I'd be more optimistic. From what I've been reading about PDF-to-HTML converters, the results vary considerably...there's significant artistry involved, and a minimal approach, as PDFBox apparently takes, doesn't take it far enough. I suspect that the Acrobat you use is not the free "Reader", but a product you bought at some point. Nowadays, they probably make you rent / subscribe if you want the latest version. eBay has used Acrobat 9 CDs for $40.
 

Attachment: pdfbox.jpg
Just to follow-up on the work-around...

I started researching the OFX option, and it appears to be workable. My approach was to find something that already worked (plug in the Fidelity account number, userid and password, and magically get output without trial and error on the OFX interface). I tried KMyMoney, which is FOSS, but couldn't get it to work. I posted my problem on their forum and moved on. I got a simple jar file called OfxExplorer to communicate with Fidelity, but all it would do is send back a list of accounts. I didn't want to become an expert in how to craft the request to get a statement. It's probably not that hard, but in the past, things like that have taken a long time because you have to get 20 parameters perfect, or you get an unhelpful error and you're left wondering. So I downloaded a Java project called OFX4J and got it to build. There was a class "DownloadStatement", so I added a main method and gave it the parameters it asked for. It had a weird bug where it wasn't setting the language, and Fidelity apparently requires it (I got a generic error like "it didn't work"), so I added a .set("en_US"), which is the programmer's usual way to specify a language. Nope, it's .set("ENG"). But after that, it worked!
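For anyone curious where that language setting ends up, here is a rough Python sketch of the sign-on block of an OFX 1.x request, with the language given as the three-letter code "ENG". This is not OFX4J's code: the function name is mine, the ORG/FID values are placeholders you'd look up for your institution, and the APPID/APPVER values are just commonly used ones that I'm assuming a server would accept. A real request also needs the OFX header and a statement request around this block, which is what the libraries generate for you.

```python
from datetime import datetime


def build_signon(userid: str, password: str, org: str, fid: str, when: datetime) -> str:
    """Build an OFX 1.x (SGML) sign-on block; leaf elements have no closing tag."""
    lines = [
        "<SIGNONMSGSRQV1>",
        "<SONRQ>",
        f"<DTCLIENT>{when.strftime('%Y%m%d%H%M%S')}",
        f"<USERID>{userid}",
        f"<USERPASS>{password}",
        "<LANGUAGE>ENG",  # three-letter language code, not a locale like en_US
        "<FI>",
        f"<ORG>{org}",    # placeholder: institution-specific
        f"<FID>{fid}",    # placeholder: institution-specific
        "</FI>",
        "<APPID>QWIN",    # assumption: a commonly used client id
        "<APPVER>2700",   # assumption: a commonly used client version
        "</SONRQ>",
        "</SIGNONMSGSRQV1>",
    ]
    return "\n".join(lines)


request = build_signon("my-user", "my-password", "EXAMPLE-ORG", "0000",
                       datetime(2021, 10, 1, 12, 0, 0))
print(request)
```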

So just got it working for one account, after-tax cash, but I suspect I can get it working for everything.
 
I always download the PDF of my statement. It never occurred to me to use the HTML, but I can certainly see how it would be useful if you want to reformat things. Have fun with your new coding efforts!

Same here. I always downloaded the PDF.
 
Since you have figured out how to get OFX from Fidelity, I have had a problem for many years with data download (OFX) from Fidelity in Quicken. Fidelity usually updates the data feed every day sometime after midnight ET, so that's usually when I download the day's data and transactions in Quicken. However on the last day of the month, transactions are often missing and I have to manually reconcile to the web site.

Once upon a time, I tried looking at the OFX files and did not get anywhere. I also tried to ask about this on Quicken's sorry excuse for support forums, and the power-user nerds there dismissed the question. I don't have this problem with Schwab or E-trade or any of the banks I use.

Anyway, if you notice anything funny in Fidelity's OFX involving transactions or interest payments on the last day of each month, I'd like to know more about what the problem might be.
 
Glad to hear you made progress with OFX downloads. Out of curiosity, any reason you didn’t try the python solution?

Java seems like a lot of overhead to me. I tend towards low-level software development, and anytime I need something higher level, Python is my go-to language.

In response to Larry, I’ve been using OFX downloads with Moneydance for years and never had any issues with transactions, so it seems to me that it’s a Quicken issue.
 
^ I already had a Java IDE set-up and running. Because that's what I did for a living, I can read and write Java without thinking too hard. And my html scraper was written in Java. The other financial institutions will run the same way they always have, only Fidelity will change.

But yes, lots of overhead. The OFX4J package is huge, uses 7 libraries (although "everybody" uses apache logging and junit, but 5 more) and has way more functionality than I need.

Anyway, if you notice anything funny in Fidelity's OFX involving transactions or interest payments on the last day of each month, I'd like to know more about what the problem might be.
I'll follow-up on any issues I have with OFX at Fidelity.
 
In response to Larry, I’ve been using OFX downloads with Moneydance for years and never had any issues with transactions, so it seems to me that it’s a Quicken issue.

My theory based on practically no evidence, is that Fidelity may be incorrectly logging the timestamps on these transactions, in such a way that Quicken refuses to recognize them because it thinks it's already downloaded them.
 

This is a longshot guess, but might they be using a GMT timestamp and not a local one? I ran into something similar in a project for a community organization with a different system, transactions we thought should be included on the last day of the month were not being found, and falling into the next month. We finally figured out (duh) that the system was using a GMT timestamp for transactions, and we had to account for that when downloading.
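Here's a small Python illustration of how that can happen. The transaction time is made up, and I'm using a fixed UTC-4 offset for US Eastern Daylight Time to keep it simple. It doesn't prove this is what Fidelity or Quicken is doing; it only shows how a late-evening transaction on the last day of the month lands in the next month once it's expressed in GMT/UTC.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset used for illustration: US Eastern Daylight Time is UTC-4.
EASTERN_DAYLIGHT = timezone(timedelta(hours=-4))

# A made-up transaction posted late in the evening on the last day of September.
posted_local = datetime(2021, 9, 30, 23, 30, tzinfo=EASTERN_DAYLIGHT)
posted_utc = posted_local.astimezone(timezone.utc)

print(posted_local.strftime("%Y-%m-%d %H:%M"), "local")
print(posted_utc.strftime("%Y-%m-%d %H:%M"), "UTC")
print("same month?", posted_local.month == posted_utc.month)
```

If a download filters on "September" using one clock while the timestamps were written with the other, this transaction is the kind that goes missing.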
 
So just got it working for one account, after-tax cash, but I suspect I can get it working for everything.
Spoke too soon. The OFX4J package had DownloadAccounts and DownloadStatement, but not DownloadBrokerage. It had classes to support brokerage, so I wrote a DownloadBrokerage, but, my nightmare, it says "bad request" with no other hint of what's wrong. Exactly what I didn't want to do...fiddle around at the API level.
 
Glad to hear you made progress with OFX downloads. Out of curiosity, any reason you didn’t try the python solution?
Given the trouble with OFX4J that I'm having, I downloaded the Python solution (ofxtools).

The good news is that I'm getting data! But not enough data to recreate the statement, stand-alone, as I had with the html statement. The problem is ofx doesn't do balances, except for "now". So positions are priced as of the time of the pull, not on the last day of the month.

There's a start and end date, so that can be, for instance, 20210901 and 20211001, and you get the appropriate transaction set. For positions, although it accepts as of date as a parameter, that is ignored and you always get the latest price. The doc page even says --asof is typically ignored. So you can apply transactions to a balance that you start with (which is what Quicken, etc probably do), but it doesn't give you an ending balance to validate against. If one uses Quicken, and the like, to pull investment data, I wonder how they handle this issue. Does the reconcile process require end of month prices to be entered? Or maybe they do a separate pull of historical closing prices for publicly traded securities?
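The start and end dates for a given statement month are easy to generate. Here's a small Python helper (the name is my own) that returns them as YYYYMMDD strings, using the first day of the month and the first day of the following month, as in the 20210901 / 20211001 example above.

```python
from datetime import date
from typing import Tuple


def month_window(year: int, month: int) -> Tuple[str, str]:
    """Return (start, end) as YYYYMMDD: first of the month, first of the next month."""
    start = date(year, month, 1)
    # December rolls over into January of the following year.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.strftime("%Y%m%d"), end.strftime("%Y%m%d")


print(month_window(2021, 9))
print(month_window(2021, 12))
```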

I could try to always have it pull the data on the last day of the month after the market closes, I suppose. It is more feasible to use ofx as a scheduled job because it doesn't need to go through the (occasional) email loop and other such gyrations of proving a human is involved.
 
If one uses Quicken, and the like, to pull investment data, I wonder how they handle this issue. Does the reconcile process require end of month prices to be entered?

I try to reconcile my banking and credit card accounts at the end of every month. Maybe I use Quicken differently than most, in that I enter every transaction manually and reconciling is just a sanity check.

I have never tried to reconcile a brokerage account. It is available, it asks for a statement date, prior cash balance and ending cash balance.

However what I always do periodically, is go to the "online center" in Quicken, and for each brokerage account, do "Compare to Portfolio". This compares the names and amounts of all securities (i.e. everything BUT the cash). (I assume this compare is to the most recent OFX download.)

I very infrequently (annually or less) manually compare the cash balance on my statement to the cash balance in Quicken. Every now and then, there is a discrepancy to chase down. Maybe that is where "reconcile" would help me.
 
Thanks LM. It appears that Quicken doesn't attempt to match statement balance for brokerage positions, but keeps track of the number of shares. And if you put in the end of month price, there's probably a way to prove to yourself the value in Quicken matches the value on the statement.
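A rough Python sketch of that idea: start from the opening share counts, apply the month's share changes, then value the result at month-end prices and compare it to the statement. Every number here is invented, "XYZ" is a placeholder symbol, and the function names are mine; this is only meant to show the arithmetic, not how Quicken does it.

```python
from decimal import Decimal
from typing import Dict, List, Tuple


def roll_forward(opening: Dict[str, Decimal],
                 trades: List[Tuple[str, Decimal]]) -> Dict[str, Decimal]:
    """Apply signed share changes (buys positive, sells negative) to opening counts."""
    shares = dict(opening)
    for symbol, change in trades:
        shares[symbol] = shares.get(symbol, Decimal("0")) + change
    return shares


def market_value(shares: Dict[str, Decimal], prices: Dict[str, Decimal]) -> Decimal:
    """Value each position at the supplied month-end price, rounded to cents."""
    total = sum((qty * prices[sym] for sym, qty in shares.items()), Decimal("0"))
    return total.quantize(Decimal("0.01"))


# All numbers below are invented for illustration.
opening = {"VUG": Decimal("10")}
trades = [("VUG", Decimal("2.5")), ("XYZ", Decimal("4"))]
month_end_prices = {"VUG": Decimal("300.00"), "XYZ": Decimal("220.00")}

ending = roll_forward(opening, trades)
value = market_value(ending, month_end_prices)
statement_value = Decimal("4630.00")  # what the statement shows in this made-up case
print(ending, value, value == statement_value)
```

Using Decimal rather than float avoids penny-level rounding surprises when comparing against a statement.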
 
I’m happy to hear the python library is working.

As for reconciling, this is something that I do manually on a regular basis using Moneydance. I regularly download transactions/quotes and compare the balances in my brokerage account to what’s in Moneydance.

I suspect that you’re right and they use the stock price and number of shares to determine balance. I’ve never had a discrepancy in Moneydance, so I’ve never thought about how it works underneath.

It’s easy enough to track this in a finance program, since they have historical data to lean on and the ability to get stock quotes. In your case, if you’re downloading monthly statements that only give you transactions, it’s a little harder. In that case, you could peek into the previous month’s statement to get the ending balance and use that for the current month. But I’m sure you’ve already thought about how to solve that problem.
 
As for reconciling, this is something that I do manually on a regular basis using Moneydance.
When I saw Moneydance was $50, I was going to jump on it, but I see you need a subscription to Moneydance+ to be able to download, and that's another $60 per year. I have avoided subscription software as a matter of principle unless it's justified. If it were just software on my system that talks to Fidelity, then there's no reason for a subscription. And if it's anything that requires interaction with Moneydance cloud servers, a subscription would be justified, but you said the OFX was between the client system and the financial institution.

But poking around at the OFX files, I realize the coding is going to be very tedious. Too bad kmymoney is buggy...that would have given me what I need without writing a solution that maintains balances, picks through the partial data in the OFX files, and looks up end-of-month prices.
 
When I saw MoneyDance was $50, I was going to jump on it, but I see it's a subscription to MoneyDance+ to be able to download, and that's another $60 per year subscription.

I have Moneydance and no issues with downloads. I avoid subscription services and didn't hear of Moneydance+ until you mentioned it.

From their blog about Moneydance+:

First, let me say upfront that Moneydance is not going subscription only and never will. If you buy a license, you can use that version (and usually the next major update) for as long as you like. In addition, we have neither the desire nor even the ability to deactivate or “sunset” features.

The Moneydance+ service does require a subscription, but it is an additional service that is purely optional. The subscription is necessary to cover our costs for connecting to the aggregator, Plaid. The previous online banking system in Moneydance will remain in place and free of charge for as long as there are banks with OFX servers to which Moneydance can connect.

https://infinitekind.com/blog/moneydance-plus-privacy-subscriptions

The blog entry is worth reading for anyone that deals with direct downloads using OFX into their personal finance software. It sounds like the days of OFX might be limited, which ain't great. At some point, I need to read up on aggregators so I understand how they work.
 
Not impossible, but I could probably manually type it for two years before I'd break even coding a PDF scraper. And then it would probably break every time I had a transaction type the scraper hasn't "seen before". Like I said, there oughta be a law where they're required to supply a machine-readable, full-detail statement. There IS a law where they're required to supply a paper statement with certain specifications...that law needs to be expanded to force them into putting that same thing into a machine-readable file. I voted with my mouse: https://digital.fidelity.com/ftgw/digital/edelivery/ clicking "US Mail" on everything. If I'm going to have to type it in, I'm certainly not going to print it myself.

You don't have to type it in. FreeOCR (actually free) can read most text images and provide plain text output. Scan the paper document, OCR it, and then parse the plain text.
 
If you know Python, you could parse out that data from the pdf.
I'm just learning Python myself. But I know it can be done, from watching YouTube videos.
I am learning Python too. I will tell you that I am an engineer and familiar with programming, and I am struggling to manipulate pdfs with Python. I have watched YouTube videos and taken 2 classes. I am going to have to ask my kids for help, lol.
 
You don't have to type it in. FreeOCR (actually free) can read most text images and provide plain text output. Scan the paper document, OCR it, and then parse the plain text.
So go from data, then to a printout, then to an image, and finally back to data again. Oh, my! How nuts has the world become? Why don't the financial institutions just allow a complete statement download? They have the data, as proven by the existence of the PDF. And now they're threatening to shut-off even the OFX tap (that's already flawed, in that it ignores "as of date"). What's going on, anyway?
 
I am learning Python too. I will tell you that I am an engineer and familiar with programming, and I am struggling to manipulate pdfs with Python. I have watched YouTube videos and taken 2 classes. I am going to have to ask my kids for help, lol.
Which financial institution(s)?
 
Probably Just Go With Web Automation

I've been using an image-matching-based automation program (SikuliX) to scrape a snapshot from financial sites, and I think I'll just expand that to build a Fidelity statement. The automation, in theory, works without intervention, but the financial institutions are always changing stuff, so the code always requires tweaking. But it generally works, and I see that I can download all transactions as a CSV (transactions were not the issue, really) and I can download a positions CSV as of the date I run the automation. If I happen to run it after the market closes on the last day of the month, and before the market opens the next day, it should match the statement. If I miss that, I'll update the CSV, overtyping the prices from the hard copy I'll soon be getting in the mail. Then I'll shred the "book" they send me...the statement I print is a single page. It actually contains more information because it includes the opening position value, share count and price. Plus it's got every transaction with details. I don't need or want all that other crap. Do people really use that stuff?
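The "overtype the prices" step could be done with a few lines of Python instead of by hand. This is only a sketch: the column names (Symbol, Quantity, Price, Value) are hypothetical, since I haven't matched them to the real positions download, and the numbers are invented.

```python
import csv
import io
from typing import Dict


def overwrite_prices(csv_text: str, statement_prices: Dict[str, str]) -> str:
    """Replace Price (and recompute Value) for symbols found in statement_prices."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row["Symbol"] in statement_prices:
            row["Price"] = statement_prices[row["Symbol"]]
            row["Value"] = f"{float(row['Quantity']) * float(row['Price']):.2f}"
        writer.writerow(row)
    return out.getvalue()


# Invented positions file, priced on the day the automation happened to run.
positions = "Symbol,Quantity,Price,Value\nVUG,10,301.25,3012.50\n"
updated = overwrite_prices(positions, {"VUG": "300.00"})
print(updated)
```

Symbols that aren't in the price dictionary are passed through unchanged, so you only need to type in the ones that moved.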
 