Text Replace with regex

I have this text:

Your individual investor number is: <PolicyNumber>


<depositnotice>

Please deposit your lump sum investment directly into the bank account below and use your investor number as your reference followed by proof of payment.

</depositnotice>


<debitorder>

Your <debitorderfrequency> investment will take place on <debitorderdate> for the value of R<debitorderamount>.

</debitorder>


I am trying to use DocIO to remove the section <debitorder> ... </debitorder> from the Word document with the regular expression option. Yet it does not detect the section, even though the regex works when tested in an online tool.


Code:

WordDocument document = new WordDocument(new FileStream("Input.docx", FileMode.Open, FileAccess.Read));
var res = document.Replace(new Regex(@"<[\w \=]{4,}>(.*?)<\/[\w \=]{4,}>", RegexOptions.Multiline | RegexOptions.IgnoreCase), "");
var res2 = document.Replace(new Regex(@"<debitorder>(.*?)<\/debitorder>", RegexOptions.Multiline | RegexOptions.IgnoreCase), "");


 


5 Replies

SB Suriya Balamurugan Syncfusion Team June 12, 2023 12:40 PM UTC

Hi gert,

From the given details, we understand that your requirement is to find and replace a pattern of text in the Word document using a Regex pattern.

On further analysis of the given text, we found that the text you are searching for extends across several paragraphs. To find and replace text that spans several paragraphs, we suggest using the ReplaceSingleLine API with a Regex pattern. Please refer to the modified code snippet below:

WordDocument document = new WordDocument(new FileStream("Input.docx", FileMode.Open, FileAccess.Read), FormatType.Docx);
var res = document.ReplaceSingleLine(new Regex(@"<[\w \=]{4,}>(.*?)<\/[\w \=]{4,}>", RegexOptions.Multiline | RegexOptions.IgnoreCase), "");
var res2 = document.ReplaceSingleLine(new Regex(@"<debitorder>(.*?)<\/debitorder>", RegexOptions.Multiline | RegexOptions.IgnoreCase), "");


Please refer to our UG documentation to learn more about finding and replacing a pattern of multiline text using Regex:
https://help.syncfusion.com/file-formats/docio/working-with-find-and-replace#find-and-replace-a-pattern-of-multiline-text
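
As a rough, library-free illustration of why a search applied to one paragraph at a time cannot see a match that spans several paragraphs, here is a small sketch using only System.Text.RegularExpressions. The paragraph array and the ParagraphMatchDemo class are hypothetical stand-ins, and the sketch is only an analogy for the behaviour described above, not a description of how DocIO is implemented internally.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ParagraphMatchDemo
{
    // True if the pattern matches inside any single paragraph on its own.
    public static bool MatchesAnyParagraph(string[] paragraphs, Regex pattern)
    {
        return paragraphs.Any(p => pattern.IsMatch(p));
    }

    // True if the pattern matches once the paragraphs are treated as one line of text.
    public static bool MatchesJoinedText(string[] paragraphs, Regex pattern)
    {
        return pattern.IsMatch(string.Concat(paragraphs));
    }

    public static void Main()
    {
        // Hypothetical stand-in for the paragraphs of the Word document.
        string[] paragraphs =
        {
            "<debitorder>",
            "Your monthly investment will take place on the 1st.",
            "</debitorder>"
        };

        var pattern = new Regex(@"<debitorder>(.*?)</debitorder>", RegexOptions.IgnoreCase);

        // No single paragraph contains both the opening and the closing tag.
        Console.WriteLine("Per paragraph: " + MatchesAnyParagraph(paragraphs, pattern));

        // Joined together, both tags are in the same string, so the pattern matches.
        Console.WriteLine("Joined text:   " + MatchesJoinedText(paragraphs, pattern));
    }
}
```

An online regex tester usually receives the whole text as one input, which is closer to the second case than to the first.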

Regards,
Suriya Balamurugan.

If this post is helpful, please consider accepting it as the solution so that other members can locate it more quickly.



GE gert June 13, 2023 10:13 PM UTC

You are amazing! 

Thank you for a professional product and service. That one line of code changed my world!!



JS Jayashree Suresh Anand Syncfusion Team June 14, 2023 06:46 AM UTC

Hi gert,

You're welcome. We are glad that the provided code snippet resolved the issue on your end. Please get back to us if you need any further assistance. We would be happy to help you. 

Regards,

Jayashree 



GE gert June 14, 2023 10:54 AM UTC

I did come across another question. If the template document includes a table, ReplaceSingleLine removes all the text, but the empty table stays in the document. For instance, in the text below, I need to remove all text and the table between <recurringwithdrawal> and </recurringwithdrawal>. How would I achieve that?



<recurringwithdrawal>

Your first recurring withdrawal will take place on <recurringwithdrawaldate> for the value of R<recurringwithdrawalamount> or <recurringwithdrawalpercent> percent.


Please note that we will be transferring the proceeds of your recurring withdrawal into the following bank account as per your application:


Account Name | Account Number | Bank | Branch | Branch Code
<<RWAccountName>> | <<RWAccountNumber>> | <<RWBank>> | <<RWBranch>> | <<RWBranchCode>>


</recurringwithdrawal>



SB Suriya Balamurugan Syncfusion Team June 15, 2023 06:38 PM UTC

gert, our Syncfusion Word (DocIO) library doesn’t consider tables while replacing multiline text.

To achieve your requirement, we suggest finding the text selection with the FindSingleLine API and adding a bookmark around that selection. Then replace the bookmark content with the required text using the ReplaceBookmarkContent API.

We have prepared a sample application to meet your requirement; it can be downloaded from the attachment below.

Note: The input Word document is in the “Data” folder of the attached sample application.

The attached sample application does the following:
1. Open the input Word document.
2. Find the pattern of text selections using the FindSingleLine API.
3. Add a bookmark start before the first selection text and a bookmark end after the last selection text.
4. Navigate to the bookmark and replace the bookmark content with the required text using the ReplaceBookmarkContent API.
5. Delete the bookmark from the Word document.
6. Save the Word document.
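
The steps above can be sketched roughly as follows. This is not the attached sample, only a minimal sketch under stated assumptions: the file names Input.docx and Output.docx, the bookmark name "SectionToRemove", and the regex are illustrative, and the sketch assumes that the first and last entries returned by FindSingleLine mark the start and end of the section. It needs the Syncfusion DocIO package and has not been verified against a particular version.

```csharp
using System.IO;
using System.Text.RegularExpressions;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;

class RemoveTaggedSection
{
    static void Main()
    {
        // Step 1: open the input Word document (file name is an assumption).
        using (FileStream input = new FileStream("Input.docx", FileMode.Open, FileAccess.Read))
        using (WordDocument document = new WordDocument(input, FormatType.Docx))
        {
            // Step 2: find the text selections that make up the tagged section.
            TextSelection[] selections = document.FindSingleLine(
                new Regex(@"<recurringwithdrawal>(.*?)</recurringwithdrawal>", RegexOptions.IgnoreCase));

            if (selections != null && selections.Length > 0)
            {
                const string bookmarkName = "SectionToRemove"; // hypothetical name

                // Step 3a: bookmark start before the first selected text range.
                WTextRange first = selections[0].GetAsOneRange();
                WParagraph firstParagraph = first.OwnerParagraph;
                int startIndex = firstParagraph.ChildEntities.IndexOf(first);
                firstParagraph.ChildEntities.Insert(startIndex, new BookmarkStart(document, bookmarkName));

                // Step 3b: bookmark end after the last selected text range.
                WTextRange last = selections[selections.Length - 1].GetAsOneRange();
                WParagraph lastParagraph = last.OwnerParagraph;
                int endIndex = lastParagraph.ChildEntities.IndexOf(last) + 1;
                lastParagraph.ChildEntities.Insert(endIndex, new BookmarkEnd(document, bookmarkName));

                // Step 4: move to the bookmark and replace everything inside it
                // (text and table) with an empty string.
                BookmarksNavigator navigator = new BookmarksNavigator(document);
                navigator.MoveToBookmark(bookmarkName);
                navigator.ReplaceBookmarkContent(string.Empty, true);

                // Step 5: delete the temporary bookmark itself.
                Bookmark bookmark = document.Bookmarks.FindByName(bookmarkName);
                if (bookmark != null)
                    document.Bookmarks.Remove(bookmark);
            }

            // Step 6: save the result (file name is an assumption).
            using (FileStream output = new FileStream("Output.docx", FileMode.Create, FileAccess.Write))
            {
                document.Save(output, FormatType.Docx);
            }
        }
    }
}
```

Because the bookmark encloses everything between its start and end, the table that sits between the two tags is replaced together with the surrounding text.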

Refer to our UG documentation to learn more about replacing bookmark content in a Word document:
https://help.syncfusion.com/file-formats/docio/working-with-bookmarks#replacing-content-in-a-bookmark

Refer to our UG documentation to learn more about removing a bookmark from a Word document:
https://help.syncfusion.com/file-formats/docio/working-with-bookmarks#removing-a-bookmark-from-word-document




Attachment: Replacemultilinetext_11d549a6.zip
