Problems Reading Attachments from PDF

June 30, 2015, 3:07 am

Hello ASPOSE,

I tried the code provided at http://www.aspose.com/docs/display/pdfnet/Get+All+Attachments+from+a+PDF+Document
But no attachment was found, even though I can see the attachment in Adobe.
The PDF was formerly created with ASPOSE.PDF.Generator. The aspose.pdf.dll version I use is 10.2.0.

Please tell me how I can access the attachment correctly.

Regards
Gerd

PS: the code I used

      // Open document
      Document pdfDocument = new Document(@"c:\temp\DownloadablesWithAttachmentsTest.pdf");
      // Get embedded files collection
      EmbeddedFileCollection embeddedFiles = pdfDocument.EmbeddedFiles;
      // Get count of the embedded files
      Console.WriteLine("Total files : {0}", embeddedFiles.Count);
      // Loop through the collection to get all the attachments
      foreach (FileSpecification fileSpecification in embeddedFiles)
      {
        Console.WriteLine("Name: {0}", fileSpecification.Name);
        Console.WriteLine("Description: {0}", fileSpecification.Description);
        Console.WriteLine("Mime Type: {0}", fileSpecification.MIMEType);
        // Check if parameter object contains the parameters
        if (fileSpecification.Params != null)
        {
          Console.WriteLine("CheckSum: {0}", fileSpecification.Params.CheckSum);
          Console.WriteLine("Creation Date: {0}", fileSpecification.Params.CreationDate);
          Console.WriteLine("Modification Date: {0}", fileSpecification.Params.ModDate);
          Console.WriteLine("Size: {0}", fileSpecification.Params.Size);
        }
        // Get the attachment and write to file or stream
        byte[] fileContent = new byte[fileSpecification.Contents.Length];
        fileSpecification.Contents.Read(fileContent, 0, fileContent.Length);
        FileStream fileStream = new FileStream(fileSpecification.Name, FileMode.Create);
        fileStream.Write(fileContent, 0, fileContent.Length);
        fileStream.Close();
      }

↧

Thumbnail generated incorrectly by pdf library

July 6, 2015, 1:20 pm

The thumbnail generated from the code below does not match the pdf. Run the attached code and extract the thumbnail from example_pdf.pdf. example_thumb.jpg illustrates the incomplete rendering.

private static void MakeThumbnail(string filePath, string thumbPath)
{
    // Load the document
    Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(filePath);

    using (FileStream imageStream = new FileStream(thumbPath, FileMode.Create))
    {
        // Create JPEG device with specified resolution (300) and quality (100)
        Aspose.Pdf.Devices.Resolution resolution = new Aspose.Pdf.Devices.Resolution(300);
        Aspose.Pdf.Devices.JpegDevice jpgDevice = new Aspose.Pdf.Devices.JpegDevice(resolution, 100);

        // Convert page 1 and save the image to stream
        jpgDevice.Process(pdfDocument.Pages[1], imageStream);

        // Close stream
        imageStream.Close();
    }
}

↧

Pdf.Kit.Lic not valid for pdf.dll

July 6, 2015, 1:58 pm

We are trying to upgrade to the Aspose.PDF.dll from the Aspose.PDF.Kit.dll that we have been using for years.

I have referenced the new dll, but when I try Aspose.SetLicense with our old kit license, I receive the exception: "The license is not valid for this product"

I had read on the Aspose site that the Kit license should still work for the new merged dll.

Is that not accurate?

Thanks.

↧

Unhandled Exception: System.ArgumentException: Invalid index in Rows indexer: -1

June 10, 2015, 5:35 am

The exception has been occurring sporadically in our production environment since we upgraded to Aspose.Pdf 9.3.0.0 from a very old version. (The exception is also easily reproducible in 10.0.5.0.) The full stack trace and a snippet of .NET code are below. A sample data file is attached. Removing random rows from the sample file makes the error go away, and the output is generated successfully. Please help. Thanks

new Aspose.Pdf.License().SetLicense("Aspose.Pdf.lic");
var pdf = new Aspose.Pdf.Generator.Pdf();
using (var file = File.OpenRead("56A.xml"))
{
	pdf.BindXML(file, null);
	// uncomment to see the exception
	//pdf.GetBuffer();
	// no exception, no output either
	pdf.Save("output.pdf");
}

Unhandled Exception: System.ArgumentException: Invalid index in Rows indexer: -1

   at Aspose.Pdf.Generator.Rows.get_Item(Int32 index)
   at ?♥‼.↑♦♫.??☼(Pdf doc, Section currentPart, Table table, Single availableHeight, Single& breakAreaHeight, Boolean rowInNewPage)
   at ?♥‼.↑♦♫.¶?☼(Pdf doc, Section currentPart, HeaderFooter hf, Table table, ←♥♫ assignInfo, ←♥♫ bakAssignInfo, ♣♦♫ useType, Boolean isFirst, ↓♦♫& breakTableNextPart)
   at ?♥‼.→♦♫.??☼(Pdf doc, Section currentPart, HeaderFooter hf, Table table, ←♥♫ assignInfo, ♣♦♫ useType, Boolean isFirst, ↓♦♫& breakTableNextPart)
   at ?♥‼.◄♦♫.??☼(Pdf , Section , ←♥♫ )
   at ?♥‼.?♥♫.§?☼(Pdf )
   at ?♥‼.?♥♫.??☼(§♠♫ , Pdf )
   at Aspose.Pdf.Generator.Pdf.GetBuffer()

↧

Extracted Signature Images Are in Black and White

June 16, 2015, 7:27 am

Hello,

When I extract signature images from a pdf, the resulting images are in black and white. For example, if the image is blue with transparent background, the result would be a white image with a black background. Same for black signatures where the background becomes black and the signature becomes white. Note that the original signatures are in Png format.

This is the code that I'm using to extract the signatures:

int count = 0;

using (Document pdfDocument = new Document(pdfPath))
{
    foreach (Field field in pdfDocument.Form)
    {
        SignatureField sf = field as SignatureField;
        if (sf != null)
        {
            count++;
            String outFile = String.Format("{0}\\Signature Image-{1}.png", destinationPath, count);

            // Extract the signature image and save it as PNG
            using (Stream imageStream = sf.ExtractImage())
            {
                if (imageStream != null)
                {
                    using (System.Drawing.Image image = Bitmap.FromStream(imageStream))
                    {
                        image.Save(outFile, ImageFormat.Png);
                    }
                }
            }
        }
    }
}

Is there any way to keep the colors as they were?

Thank you.

↧

ADA Compliance

July 7, 2015, 7:59 am

We are using Aspose.Pdf to convert tiff images to PDF for archival purposes. We have a need to ensure PDF/A and ADA compliance. We have implemented PDF/A - seems like ADA compliance is a bit more tricky. Can Aspose.PDF generate ADA compliant documents? Thanks

↧

PDF Encoding

April 2, 2014, 8:34 pm

Hi,
I am trying to read a UTF-8 file with Khmer characters and output it as a PDF using Aspose PDF. However, the output PDF is blank. Does Aspose support Khmer characters? The following is my code:

import java.io.File;
import java.nio.charset.Charset;
import aspose.pdf.*;
import org.apache.commons.io.FileUtils;

public class AsposePdf {

    public static void main(String[] args) throws Exception {
        new AsposePdf().createPdf();
    }

    public void createPdf() throws Exception {
        // Instantiate Pdf object by calling its empty constructor
        Pdf pdf = new Pdf();

        // Create a new section in the Pdf object
        Section sec = pdf.getSections().add();

        final String hello = FileUtils.readFileToString(new File("hello.txt"), Charset.forName("UTF-8"));

        // Create a new text paragraph and pass the text to its constructor as argument
        Text text = new Text(sec, hello);

        text.getTextInfo().setFontName("Arial Unicode MS");

        sec.getParagraphs().add(text);

        pdf.setUnicode();
        pdf.save("hello.pdf");
    }
}

Regards,
Cheong
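
One way to narrow down a blank result like the one above is to confirm that the file-reading step really yields Khmer text before it reaches the PDF library. The following is only a sketch using plain JDK calls; the class name `KhmerDecodeCheck` and its helper methods are made up for illustration, and it says nothing about how Aspose itself handles Khmer fonts.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class KhmerDecodeCheck {

    // Counts the code points that belong to the Unicode Khmer block (U+1780..U+17FF).
    static long countKhmer(String text) {
        return text.codePoints()
                   .filter(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.KHMER)
                   .count();
    }

    // Writes the text to a temporary file as UTF-8 and reads it back as UTF-8.
    static String roundTrip(String text) throws IOException {
        Path tmp = Files.createTempFile("hello", ".txt");
        try {
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // Six Khmer code points, written as escapes so the source encoding does not matter.
        String sample = "\u179F\u17BD\u179F\u17D2\u178F\u17B8";
        String decoded = roundTrip(sample);
        System.out.println(decoded.equals(sample)); // prints "true"
        System.out.println(countKhmer(decoded) + " of "
                + decoded.codePointCount(0, decoded.length())); // prints "6 of 6"
    }
}
```

If the same count on the real hello.txt comes back as zero, the problem is in the input file or its decoding rather than in the PDF generation step.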

↧

HTML to PDF Conversion Error

June 30, 2015, 7:09 am

I am using Aspose Total for Java SDK to convert files.
When converting the attached Html file to pdf, I am getting the following error:

class com.aspose.pdf.internal.p344.z10: Value cannot be null.
Parameter name: rawHtml
com.aspose.pdf.internal.p225.z2.m1(Unknown Source)
com.aspose.pdf.internal.p211.z4.m2(Unknown Source)
com.aspose.pdf.internal.p211.z5.m1(Unknown Source)
com.aspose.pdf.internal.p211.z4.m3(Unknown Source)
com.aspose.pdf.internal.p211.z4.m3(Unknown Source)
com.aspose.pdf.internal.p211.z4.m1(Unknown Source)
com.aspose.pdf.internal.p211.z4.m1(Unknown Source)
com.aspose.pdf.internal.p214.z1.m1(Unknown Source)
com.aspose.pdf.z30.m1(Unknown Source)
com.aspose.pdf.ADocument.m1(Unknown Source)
com.aspose.pdf.ADocument.<init>(Unknown Source)
com.aspose.pdf.Document.<init>(Unknown Source)

Please let me know how to resolve this issue.
Thanks.

↧

superscript and apostrophe

May 27, 2015, 4:02 pm

Please check the attached PDF file. I'm trying to save a PDF as TXT, and some characters are not being converted correctly.

The superscript in 10^2 in the PDF is converted to a dash (-), and the apostrophe is converted to a dash as well. Please advise.

↧

html To pdf conversion - Embed Font

July 7, 2015, 10:12 am

I am evaluating Aspose for Khmer support, to see whether it fits our purpose.

We need to convert HTML into PDF. The HTML contains styles to embed fonts, as below:

<html>
<head>
<style type="text/css">

@font-face{
font-family: 'KhmerOS';
src: url("KhmerOS.ttf");
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}
@font-face{
font-family: 'KhmerOSmuollight';
src: url('KhmerOSmuollight.ttf');
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}

    .khmertext {
font-family: "Khmer OS","KhmerOS", "KhmerOSmuollight",sans-serif;
}

</style>
</head>
<body>

<div class="khmertext">This is Khmer = វិញ្ញាបនប័ត្ររបស់កេ</div>

</body>
</html>

And I am trying to convert it using the following code

    String basePath = "c:\\basePath\\";
    com.aspose.pdf.HtmlLoadOptions htmloptions = new com.aspose.pdf.HtmlLoadOptions(basePath);
//    htmloptions.setInputEncoding("UTF-8");
//    TextState txtState = new TextState();
//    txtState.setFont(FontRepository.openFont(basePath + "KhmerOS.ttf"));
//    htmloptions.getPageInfo().setDefaultTextState(txtState);

    // Load HTML file
    com.aspose.pdf.Document doc = new com.aspose.pdf.Document(basePath+"test.html", htmloptions);

    // Save the result as PDF
    doc.save("c:\\basePath\\result.pdf");

Is something like this even possible using Aspose? Given that the font is embedded through the HTML styles, ideally there should be no need to specify the font again in Java at conversion time. Even if I do, it doesn't work.

Is there an example that I can look at?

Thanks

↧

Last page of PDF will not print

July 7, 2015, 11:36 am

Hi, I am evaluating ASPOSE to use for printing existing PDF files using asp.net. I can print the file just fine with the exception of the last page. The last page of each file will not print. Aspose.Pdf.Facades.PdfViewer.PageCount reports the correct number of pages. Please see the code below:

{
    string fname = @"\\is-apps02\RPM\Working\Admin\g_payroll.pdf";
    Aspose.Pdf.Facades.PdfViewer pdfv = new Aspose.Pdf.Facades.PdfViewer();
    System.Drawing.Printing.PageSettings pgs = new System.Drawing.Printing.PageSettings();
    System.Drawing.Printing.PrinterSettings prin = new System.Drawing.Printing.PrinterSettings();
    pdfv.BindPdf(fname);
    int pgcnt = 0;
    pgcnt = pdfv.PageCount;
    txt2.Text = pgcnt.ToString();
    prin.PrintRange = System.Drawing.Printing.PrintRange.AllPages;
    prin.PrinterName = cmbxPrinterNames.SelectedValue.ToString();
    prin.DefaultPageSettings.PaperSize = new System.Drawing.Printing.PaperSize("letter", 827, 1169);
    pgs.Margins = new System.Drawing.Printing.Margins(0, 0, 0, 0);
    pgs.PaperSize = prin.DefaultPageSettings.PaperSize;
    pdfv.PrintDocumentWithSettings(pgs, prin);
    pdfv.Close();
}

Any suggestions?
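
A side observation on the paper size in the code above, which may or may not be related to the missing page: System.Drawing.Printing.PaperSize takes its width and height in hundredths of an inch, and 827 x 1169 corresponds to A4, not US Letter. The arithmetic can be checked with a small sketch (Java here only for the calculation; the class and method names are made up for illustration):

```java
public class PaperSizes {

    // Converts millimetres to hundredths of an inch, rounded to the nearest integer.
    static int mmToHundredthsOfInch(double mm) {
        return (int) Math.round(mm / 25.4 * 100.0);
    }

    public static void main(String[] args) {
        // A4 is 210 mm x 297 mm.
        System.out.println("A4:     " + mmToHundredthsOfInch(210) + " x " + mmToHundredthsOfInch(297));
        // US Letter is 8.5 in x 11 in, i.e. 215.9 mm x 279.4 mm.
        System.out.println("Letter: " + mmToHundredthsOfInch(215.9) + " x " + mmToHundredthsOfInch(279.4));
    }
}
```

This prints 827 x 1169 for A4 and 850 x 1100 for Letter, so the size named "letter" in the post is actually A4.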

↧

Getting message "At most 4 elements (for any collection) can be viewed in evaluation mode." but my license is still good till next year

July 7, 2015, 1:37 pm

Hello. I am receiving the following message out of nowhere starting today. "At most 4 elements (for any collection) can be viewed in evaluation mode." The software has been working fine for months, and just now it says this. My license file says the expiration date is 20160114, so it should not be trying to use evaluation mode. I am using Aspose.Pdf version 1.5 and have always been using it from the start. What could have happened to cause this?

↧

multi-column section followed by another section on same page crashes during save

December 2, 2014, 4:32 pm

When creating a PDF with a section that has multiple columns, followed by another section that does not start on a new page, the Save call crashes.

var pdf = new Pdf();
var colSection = pdf.Sections.Add();
colSection.ColumnInfo.ColumnCount = 2;
colSection.Paragraphs.Add(new Text("test"));
var anotherSection = pdf.Sections.Add();
anotherSection.IsNewPage = false;
pdf.Save(new MemoryStream()); //System.ArgumentNullException : Value cannot be null. Parameter name: fontInfoProvider

Aspose.Pdf version 9.8.0.0

↧

The height of split table

July 7, 2015, 12:31 am

Hello,

I need to get a real table height. I used to use table.GetHeight API method, but sometimes a table may be split with a repeated header. In this case the height of the header is ignored by your API. In other words the API returns the same result no matter if a table is split or it's not.

Could you please advise how to calculate the table height properly in this case.

↧

SetLicense in WCF service

July 16, 2015, 10:31 am

Hi,

We are using Aspose.PDF.Dll (version 10.2.0.0) in a WCF service. Once in a while, we get "Object reference not set to an instance of an object." at the line Aspose.Pdf.License.SetLicense(String licenseName), and the WCF service stops working. We have to reset the app pool for the WCF service to make it work again.

The WCF service is configured as

[ServiceBehavior(ConcurrencyMode = ConcurrencyMode.Multiple, InstanceContextMode = InstanceContextMode.PerCall)]

It has a class that calls SetLicense; the license file is included in the project as an embedded resource.

license.SetLicense(licenseFileName);

license.Embedded = true;

What's the best way to avoid this error? It only happens 1% of the time.

Thank you and waiting for your response.
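
If the intermittent failure comes from several requests applying the license at the same moment (an assumption; the post does not confirm the cause), one common pattern is to apply the license exactly once per process and let every request wait for that single initialization. Below is a language-neutral sketch of the idea written in Java. `applyLicense` is a hypothetical stand-in for the real SetLicense call, and the counter exists only to make the once-only behavior observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LicenseOnce {

    // Counts how often the (hypothetical) license call actually ran.
    static final AtomicInteger applyCount = new AtomicInteger();

    // Hypothetical stand-in for the real license call; not an Aspose API.
    static void applyLicense() {
        applyCount.incrementAndGet();
    }

    // Initialization-on-demand holder: the JVM runs this class initializer
    // exactly once, and concurrent callers block until it has finished.
    private static final class Holder {
        static final boolean READY = init();

        private static boolean init() {
            applyLicense();
            return true;
        }
    }

    // Safe to call from any number of threads, any number of times.
    static boolean ensureLicensed() {
        return Holder.READY;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(LicenseOnce::ensureLicensed);
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(applyCount.get()); // prints "1"
    }
}
```

In a .NET service the same idea would typically be expressed with a static constructor or Lazy<T>; the point is only that per-call service instances share one initialization instead of each calling SetLicense.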

↧

Your Ticket # PDFNEWNET-38945

July 16, 2015, 11:09 am

Hello there,

The above ticket was generated on a previous post by me under another login. This is the login of my client and as they have purchased your Aspose.pdf to support the application I developed in my own office using your trial version, it seems appropriate that your ticket be associated with this profile rather than my own. Here is the text of my original post and the attachments I provided at that time.

Thank you

____________________________________________________

Hello there,

I have attached a copy of the printed PDF and the file version. As you can see there are data fields in the print copy that are not formatted correctly and others that are blank.

The data is coming from an OLEDB connection to an SQL Server. I have included my code also.

The Original PDF we are filling was created using LiveCycle Designer 11.0, PDF version is 1.7, Adobe Extension Level 3 (Acrobat 9.x).

Can you shed some light on this?

Thanks

Mike

↧

Wrapping of Text when doing Search and Replace

July 11, 2011, 6:55 am

Hello,

I am trying to replace text in PDF files using the following code. The problem is that the text does not wrap if the line becomes too long due to the replacement text. Is there a way to handle this? I have attached sample input and output files to illustrate the problem.

        Document pdfDocument = new Document("input.pdf");
        TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("rinse");
        pdfDocument.Pages.Accept(textFragmentAbsorber);

        TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

        foreach (TextFragment textFragment in textFragmentCollection)
        {
            textFragment.Text = "LONG REPLACEMENT TEXT";
        }

        pdfDocument.Save("output.pdf");

- Sumit

↧

Image to Searchable PDF

April 6, 2015, 6:38 am

I'm trying to convert an image to a searchable PDF.

To do this, I use Tesseract to perform the OCR.

The problem is that the Convert method never calls the CallbackGetHocr method.

Here is the code:

private string convertToSearchablePDF(string imagePath)
{
    var input = this.imageToPdf(imagePath);
    var job = new OcrJob();
    var imgPath = imagePath;

    using (var doc = new Document(input))
    {
        // Callback: run OCR on each image and return the HOCR markup
        doc.Convert(img =>
        {
            var hocr = job.RunHOCR(new Bitmap(img));
            File.WriteAllText(imgPath + ".hocr.html", hocr);
            return hocr;
        });

        var output = imagePath + ".output.pdf";
        doc.Save(output);
        return output;
    }
}

The Convert method correctly calls the callback, and the HOCR is returned. I write the HOCR to a file so that I can check the content.

But then the generated PDF is not searchable. No error is triggered.

I've attached some files:

- invoice13.jpg: the original image

- invoice13.jpg.pdf: the pdf created from the image (non-searchable) by the imageToPdf method

- invoice13.jpg.hocr.html: the output of tesseract ocr

- invoice13.jpg.output.pdf : what should be a searchable PDF

↧

TextFragmentAbsorber using Regular Expression not spanning multiple pages.

July 16, 2015, 3:21 pm

I'm trying to extract text from a PDF based on a regular expression. It seems to be working for the most part, but I have encountered a strange behavior.

If the text that I'm looking for spans multiple pages, does TextFragmentAbsorber treat it as continuous text? It looks like it stops at the end of page 1 even though I indicated all pages. In fact, it picked up text from the bottom of the first page that meets my regular expression and then a paragraph from the top of the first page, all in a single TextFragment.

Below is the section of the code for your reference and I've attached the complete CS code and pdf file being used to test this. I was expecting text from page 3 to be captured as well since I would like to incorporate the "Recipe Tip" into my regular expression.

//DIRECTIONS
//Create TextAbsorber object to extract text
TextFragmentAbsorber textFragmentAbsorberDirections = new TextFragmentAbsorber("Directions(\r\n|\r|\n)[0-9a-zA-Z(\r\n|\r|\n)°@#$%&*+\\-_(),+':;?.,!\\[\\]\\s\\/ è]*");

//Set text search option to specify regular expression usage
TextSearchOptions textSearchOptionsDirections = new TextSearchOptions(true);
textFragmentAbsorberDirections.TextSearchOptions = textSearchOptionsDirections;

//Accept the absorber for all the pages
pdfDocument.Pages.Accept(textFragmentAbsorberDirections);

//Get the extracted text from first fragment
Console.WriteLine("{0}", textFragmentAbsorberDirections.TextFragments[1].Text);
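
A side note on the pattern above, separate from the multi-page question: inside a character class `[...]`, the characters `(`, `|` and `)` are literals, so `(\r\n|\r|\n)` within the brackets does not act as a group or alternation. It simply adds CR, LF, parentheses and the pipe to the set of allowed characters. The small sketch below shows this with Java's regex engine, which treats these characters the same way inside a class; the class name and the simplified patterns are made up for illustration.

```java
import java.util.regex.Pattern;

public class CharClassDemo {

    // Simplified version of the class from the post: the "group" sits inside [...].
    static final Pattern CLASS_WITH_GROUP = Pattern.compile("[a-z(\\r\\n|\\r|\\n)]+");

    // The same intent written explicitly: lowercase letters plus CR and LF.
    static final Pattern CLASS_EXPLICIT = Pattern.compile("[a-z\\r\\n]+");

    // True if the whole input consists only of characters the pattern allows.
    static boolean matchesWhole(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole(CLASS_WITH_GROUP, "ab\r\ncd")); // prints "true"
        System.out.println(matchesWhole(CLASS_WITH_GROUP, "a|b"));      // prints "true": '|' is a literal member
        System.out.println(matchesWhole(CLASS_EXPLICIT, "a|b"));        // prints "false"
    }
}
```

Since `\s` in the original class already covers CR and LF, the bracketed group there mainly has the side effect of allowing a literal pipe character in the match.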

↧

TextFragmentAbsorber Only Getting First 4 Fragments

July 16, 2015, 2:13 pm

Hi,

I'm trying to extract all of the text from a PDF page so that I can change the text's color with Aspose 10.6.0. I'm using the TextFragmentAbsorber, but no matter what PDF I use it's only returning me the first 4 text fragments in its Results View - the rest of the text fragments I can see in its Non-Public Members but I can't figure out a way of accessing them...

Here's how I'm using TextFragmentAbsorber to extract the text:

var page = pdfDoc.Pages[pageNum]; // get a page from the PDF doc
var tfa = new TextFragmentAbsorber(); // create the TextFragmentAbsorber
page.Accept(tfa);
var tfc = tfa.TextFragments; // get text fragments for page

foreach (TextFragment tf in tfc) // loop through each text fragment and set its color to black
{
    tf.TextState.ForegroundColor = Color.Black;
}

However, even though tfa.TextFragments above has a count of 13 for this particular PDF page, only its Non-Public Members contain all 13 fragments, its Result View only contains the first 4 fragments (see attached screen shot).

Am I using TextFragmentAbsorber incorrectly? Or should I be using another strategy to change the text color for this PDF page?

Thank you,
-Michael

↧