
Parsing Table of Contents

Thread ID: 104657
Created: Aug 6, 2012 03:03 PM UTC
Updated: Aug 13, 2012 03:50 AM UTC
Platform: WinForms
Replies: 3
Tags: DocIO
Steve Aspey
Asked On August 6, 2012 03:03 PM UTC

I'm writing a parsing routine to parse Word documents to a proprietary object hierarchy used in my program. 

I am having trouble recognising a Word Table of Contents. In the older DOC format, I can use EntityType.TOC to detect when I am dealing with a TOC. In DOCX, however, the exact same document (saved as DOCX) run through the exact same code fails to detect the TOC using EntityType.

Here's a simple example:

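    ' Namespaces assumed for the DocIO types used below.
    Imports Syncfusion.DocIO
    Imports Syncfusion.DocIO.DLS

    Public Sub ExampleParse(myfile As String)

        ' Open the Word document (DOC or DOCX) at the given path.
        Dim doc As New WordDocument(myfile)
        Dim TOCEntity As ParagraphItem = Nothing

        ' Walk every item of every paragraph in every section.
        For Each section As WSection In doc.Sections
            For Each paragraph As WParagraph In section.Paragraphs
                For Each item As ParagraphItem In paragraph.Items
                    Select Case item.EntityType
                        Case EntityType.TOC
                            ' I get here with a Word DOC file but never with the exact same file in DOCX version
                            TOCEntity = item
                        Case Else
                            ' All other item types are ignored in this example.
                    End Select
                Next
            Next
        Next

    End Sub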


Has anyone got any ideas why the DOCX version of the same file has this problem? I've attached two test documents as examples. Set myfile to the name and path of one or the other of the test Word documents.

I see that the WordDocument class has a Friend member called TOC but I can't get access to that because it is marked Friend and not Public.


toc test_41d18709.zip

Ramkumar M [Syncfusion]
Replied On August 7, 2012 12:43 PM UTC

Hi Steve,

Thank you for your interest in Syncfusion products.

On analyzing your DOCX document, we found that the TOC field is preserved inside a StructureDocumentTag entity instead of within a paragraph. Currently DocIO provides only preservation support for the StructureDocumentTag entity, so it is not possible to loop through this entity to get the TOC. As a workaround, please resave the document with another version of MS Word, or use DocIO, so that the TOC is preserved within a paragraph. For your reference, the resaved document with the TOC preserved inside a paragraph is available from the following link:

Resaved Document:                                    

http://www.syncfusion.com/downloads/Support/DirectTrac/96973/toc%20test-resaved-1222843957.zip

Please let us know if you have any other questions.

Regards

Ramkumar


Steve Aspey
Replied On August 9, 2012 06:45 PM UTC

Thanks for the quick reply. I'm not clear on one thing. If, for example, I have a Word 2010 format DOCX and I use DocIO to open it and resave it in a different format, I'm assuming the only choice I have is to resave it as, say, Word 2007. Is that how you created the resaved file, or did you use MS Word and save it in another format?

Thanks

Ramkumar M [Syncfusion]
Replied On August 13, 2012 03:50 AM UTC

Hi Steve,

Thanks for your update.

We resaved your DOC document (toc test.doc) as a DOCX document using DocIO; we did not resave the DOCX document (toc test.docx). If you resave your DOCX document, the TOC field will not be moved into a paragraph; it will remain as it is, inside the StructureDocumentTag (content control).

Please let us know if you have any questions.

Regards

Ramkumar
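
For readers following the workaround, the resave step described in the reply above might look roughly like the sketch below. It assumes the binary DOC version of the file is available, since that is the version the reply says was resaved. The file paths and the module name are hypothetical; only the WordDocument constructor, Save with FormatType.Docx, and Close are used from DocIO.

```vbnet
Imports Syncfusion.DocIO
Imports Syncfusion.DocIO.DLS

Module ResaveExample
    Sub Main()
        ' Hypothetical paths; point these at your own files.
        Dim sourcePath As String = "toc test.doc"
        Dim targetPath As String = "toc test-resaved.docx"

        ' Open the DOC version, where the TOC is stored within a paragraph.
        Dim doc As New WordDocument(sourcePath)
        Try
            ' Write the same content out in DOCX format.
            doc.Save(targetPath, FormatType.Docx)
        Finally
            ' Release the document's resources whether or not the save succeeded.
            doc.Close()
        End Try
    End Sub
End Module
```

According to the reply, a DOCX produced this way keeps the TOC within a paragraph, so a loop like ExampleParse should then reach the EntityType.TOC case. Starting from the original DOCX instead would leave the TOC inside the content control.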

