Parsing complex Word documents into a database.

Hello. I have some documents in .doc/.docx format that I'm trying to manipulate. They are divided into chapters and sub-chapters, and ideally I would like to read the chapters by number into a database. Since the format varies between files and they include tables and sometimes mathematical expressions, this is turning out to be a pretty complex task. Has anybody worked on something similar who can recommend an article, a library, or anything else to get me started? Any ideas or suggestions are more than welcome.

by Gnzzz via /r/csharp
