I’ve been developing software that counts the pages in multiple PDFs. One issue I’ve run into is determining whether a PDF is corrupt. As a test, I opened a multipage PDF, scrolled to the end, and deleted a small amount of content. Aspose.PDF still returns the correct page count for that file, even though Adobe Reader cannot open it.
I do have some code that checks each page during the count, but it makes processing about ten times slower. Is there a reasonably fast way to tell whether a PDF is corrupt right after opening the document?
The code below tells me whether the PDF is potentially corrupt:
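using System;
using Aspose.Pdf;

// Returns the page count, or -2 if the PDF could not be read.
private int countPDFPages(string fullPath)
{
    int pageCount = 0;
    try
    {
        using (Document doc = new Document(fullPath))
        {
            pageCount = doc.Pages.Count;
            // This loop helps detect a bad PDF but really slows down the processing.
            // Need to find a better validation.
            for (int i = 1; i <= pageCount; i++)
            {
                // Touching each page's contents forces it to be parsed;
                // a damaged page throws here.
                int discard = doc.Pages[i].Contents.Count;
            }
        }
    }
    catch (Exception)
    {
        // Any failure while opening or reading is treated as a potentially corrupt file.
        pageCount = -2;
    }
    return pageCount;
}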
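One cheap pre-check I could run before opening the document, sketched here as an idea rather than a confirmed fix, is to look for the `%PDF-` header near the start of the file and the `%%EOF` marker near the end, since a file that was cut off at the end will often be missing the latter. The class name `PdfQuickCheck`, the method `HasHeaderAndEofMarker`, and the 1024-byte window are my own assumptions for illustration; none of this is part of Aspose.PDF, and it only catches missing markers, not damage in the middle of the file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not an Aspose.PDF API): a fast structural sanity check.
public static class PdfQuickCheck
{
    // Returns true when "%PDF-" appears in the first `window` bytes and
    // "%%EOF" appears in the last `window` bytes of the file.
    // The window size is an assumed, lenient default.
    public static bool HasHeaderAndEofMarker(string fullPath, int window = 1024)
    {
        using (FileStream fs = File.OpenRead(fullPath))
        {
            string head = ReadAscii(fs, 0, window);
            if (!head.Contains("%PDF-"))
            {
                return false;
            }

            long tailStart = Math.Max(0, fs.Length - window);
            string tail = ReadAscii(fs, tailStart, window);
            return tail.Contains("%%EOF");
        }
    }

    // Reads up to `count` bytes starting at `offset` and decodes them as ASCII.
    private static string ReadAscii(FileStream fs, long offset, int count)
    {
        fs.Seek(offset, SeekOrigin.Begin);
        byte[] buffer = new byte[count];
        int total = 0;
        int read;
        while (total < count && (read = fs.Read(buffer, total, count - total)) > 0)
        {
            total += read;
        }
        return Encoding.ASCII.GetString(buffer, 0, total);
    }
}
```

In countPDFPages this could run first, returning -2 when the check fails, so that only files passing it go on to the Aspose.PDF page count. It reads at most two small chunks per file, so it should cost far less than touching every page, but a file can pass this check and still be damaged elsewhere.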