Quantcast
Channel: Adobe Community: Message List
Viewing all articles
Browse latest Browse all 97529

Split PDF based on content into different pdfs with custom file name but not all page have the identifier

$
0
0

Hi would really appreciate advice and expertise on a task I will like to do described below but given that I have very little expertise in Javascript but have Acrobat Pro DC, I am stuck.

 

I have a report that contains multiple reports in a single pdf which I want to split it into individual reports. The cover page of each report has a report number XPD Report - nnn. I want to be able to search for a specific string ("XPD Report") within the pdf, and then save the sequence of numbers that come after that string for my file name ("001","002"...). I would want to extract the cover page and the succeeding pages of the XPD report and include it into one pdf until it finds another unique XPD Report.

 

For example,

 

page 1    XPD Report - 001

page 2   

page 3    XPD Report - 002

page 4   

page 5    XPD Report - 003

page 6   

page 7

 

Using this example, pages 1 and 2 would be extracted into one pdf together. Page 3 and 4 would be extracted by itself, and pages 5, 6 and 7 would be extracted into one pdf.

 

I was trying to replicate as much as possible what I could find of the codes in the forum link below and running as an Action in Action Wizard, but to no avail.

 

https://forums.adobe.com/thread/2502247?q=Split%20PDF%20based%20on%20content%20but%20not

 

 

  1. var curDoc = app.activeDocs[0]; 
  2. var pageArray=[]; 
  3. var repeat = 0
  4. var dataCode = ""
  5. var startPage = pageArray[0]; 
  6. var startPageNumber = 0
  7. var lastPageNumber = curDoc.numPages; 
  8. lastPageNumber--; 
  9.  
  10.  
  11. // This part gets all the page numbers from the document as before 
  12. for (var p = 0; p < curDoc.numPages; p++) 
  13.     for(var n = 0; n< curDoc.getPageNumWords(p); n++) 
  14.     { 
  15.        if(curDoc.getPageNthWord(p,n)=="XPD REPORT -"
  16.        { 
  17.             dataCode=curDoc.getPageNthWord(p,n+1) ; 
  18.             pageArray.push(dataCode); 
  19.             break
  20.        } 
  21.     } 
  22.  
  23. // This bit has been refactored to stop the need to go through all the pages again 
  24. // it also uses the ability of insertPages to insert more than one page at a time. 
  25. for ( var i = 1; i < pageArray.length; i++) 
  26.     var endPageNumber = i - 1
  27.      
  28.     // if we have a match, AND we are not the last page, keep going 
  29.     if (( startPage === pageArray[i]) && ( i !== lastPageNumber)) 
  30.     { 
  31.         exportFile = false 
  32.     } 
  33.     // if we are the last page, we don't care about a match anymore. 
  34.     elseif ( i === lastPageNumber) 
  35.     { 
  36.         // catch if we are at the end of the document 
  37.         exportFile = true
  38.         endPageNumber = i; 
  39.     } 
  40.     // we are not the last page, and we are not a match for the pages we are looking for 
  41.     else 
  42.     { 
  43.         // catch when we have passed the current page 
  44.         exportFile = true
  45.  
  46.     } 
  47.     // once we have some files to process. 
  48.     if ( exportFile) 
  49.     { 
  50.         d = app.newDoc(); 
  51.         // call insert pages once with the page range to insert. 
  52.         d.insertPages ( 
  53.         { 
  54.             nPage: d.numPages -1
  55.             cPath: curDoc.path, 
  56.             nStart: startPageNumber, 
  57.             nEnd : endPageNumber, 
  58.         }); 
  59.         // remove initial page 
  60.         d.deletePages(0); 
  61.         // set up for the next run 
  62.         startPage = pageArray[i]; 
  63.         startPageNumber = i; 
  64.     } 

Viewing all articles
Browse latest Browse all 97529

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>