[PDF] Merge all annotations type

March 20, 2015, 8:59 am

Hello,

We currently have a service that merges the annotations from several copies of the same PDF file into a single PDF file containing all of them.

We also need it to handle the freehand drawing annotation type.

Please find enclosed the existing code for merging all annotations, and an example of PDF files to merge.

Thank you very much.

Best regards.

Alexis.

↧

(M)HTML to PDF conversion does not support Unicode

July 7, 2015, 1:07 am

Hello,

I'm trying to convert a .mht/.html file with Aspose.Pdf 10.5.0. These files may contain language-specific letters (e.g. German umlauts). In the resulting PDF, these characters are either dropped or replaced with question marks.

HTML to PDF (as suggested in a bug report):

var pdf = new Pdf
{
  HtmlInfo =
  {
    CharSet = "UTF-8",
    CharsetApplyingLevelOfForce = HtmlInfo.CharsetApplyingForceLevel.EnforceUseAlways
  }
};
pdf.SetUnicode();
var section = pdf.Sections.Add();
var text = new Text(section, htmlString)
{
  IsHtmlTagSupported = true,
  IsHtml5Supported = true,
  TextInfo = {FontName = "Arial Unicode MS"},
  IfHtmlTagSupportedOverwriteHtmlFontNames = true
};
text.TextInfo.IsFontEmbedded = true;
section.Paragraphs.Add(text);
pdf.Save(pdfOutputPath);

MHTML to PDF:

using (var document = new Document(mhtmlFile, new MhtLoadOptions()) { PageInfo = { Margin = new Aspose.Pdf.MarginInfo(25, 20, 25, 25) } })
{
  document.Save(pdfOutputPath, SaveFormat.Pdf);
}

Note: I couldn't find any way to add Unicode support for the 'Aspose.Pdf.Document.Document' class. Should it be auto-detected or is it missing?

I prefer the second (MHTML to PDF) approach.

↧

How should I extract table data from PDF files using Aspose.PDF

January 23, 2014, 12:55 pm

Hi All,

I would like to extract table data from PDF files. Do you know which method I should use in order to extract table data from PDF?

Thanks in advance.

Min

↧

Apply rotation on existing content

October 8, 2015, 7:25 am

Hi,

Is it possible to rotate all the content inside a page by an arbitrary angle (using a Matrix)? If so, how am I supposed to proceed?

What I need is to rotate the content itself, not the page, so page.setRotation() does not suit me.
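
For reference, rotating content by an arbitrary angle about a pivot point comes down to a six-number affine matrix [a b c d e f], the form PDF content streams use. The following is only a sketch of that arithmetic in plain Java. The class and method names are hypothetical, and how the resulting matrix is applied to the page contents depends on the library and is not shown here.

```java
/** Hypothetical helper: computes PDF-style matrix coefficients [a b c d e f]. */
final class ContentRotation {

    /**
     * Returns {a, b, c, d, e, f} for a counter-clockwise rotation of
     * angleDegrees about the pivot (cx, cy), so that
     *   x' = a*x + c*y + e
     *   y' = b*x + d*y + f
     */
    static double[] rotationAbout(double angleDegrees, double cx, double cy) {
        double t = Math.toRadians(angleDegrees);
        double cos = Math.cos(t);
        double sin = Math.sin(t);
        // Translation terms chosen so that the pivot maps onto itself.
        double e = cx - cx * cos + cy * sin;
        double f = cy - cx * sin - cy * cos;
        return new double[] {cos, sin, -sin, cos, e, f};
    }

    /** Applies the matrix to a point; useful for checking the result. */
    static double[] apply(double[] m, double x, double y) {
        return new double[] {
            m[0] * x + m[2] * y + m[4],
            m[1] * x + m[3] * y + m[5]
        };
    }
}
```

With a pivot at the page centre, the pivot stays fixed and everything else turns around it. For example, a 90 degree rotation about (300, 400) maps the point (400, 400) to approximately (300, 500).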

Regards

↧

Aspose.PDF for Cloud confuses two separate characters with one single character

September 29, 2015, 7:48 am

We are using Aspose.PDF for Cloud to parse and populate PDF templates. The service has a bug where it treats the two separate characters 'f' and 'i' (fi) as the single ligature character 'ﬁ', and the population then fails. You can easily reproduce this by parsing and populating a PDF template that contains these two characters.
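
Until the service itself is fixed, a possible client-side mitigation is to fold the ligature back into plain letters before comparing text. This is a minimal sketch using only the Java standard library. The class name is hypothetical, and it assumes the text returned by the service is available as a Java String.

```java
import java.text.Normalizer;

/** Hypothetical helper for comparing text that may contain ligatures. */
final class LigatureText {

    /**
     * Unicode NFKC normalization replaces compatibility characters,
     * including the ligature U+FB01, with their plain equivalents.
     */
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }
}
```

Here `LigatureText.fold("\uFB01eld")` gives "field". Note that NFKC also rewrites other compatibility characters (full-width forms, for example), so it is better suited to matching than to producing the final output text.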

↧

Abort PDF Save thread

October 13, 2015, 3:14 am

Hi,

I would like to know whether there is any way to kill the thread while it is saving a PDF document, so that Aspose releases the PDF file being created.

Can InterruptionToken do it?

My code in the worker thread:

Aspose.Pdf.Generator.Section sec = pdfFile.Sections.Add();
image = Aspose.Pdf.Generator.Image.FromSystemImage(new Bitmap(srcFile));
// do something about image
sec.Paragraphs.Add(image);
pdfFile.Save(outputFile);

Thanks

↧

aspose.pdf.document to aspose.words.document

October 13, 2015, 4:26 am

I have an Aspose.Pdf.Document that I would like to convert to an Aspose.Words.Document.

What is the easiest way to do this?

↧

Error converting PDF to PDF/A-1b

October 13, 2015, 5:49 am

Hello,

Please try to convert the attached document to PDF/A-1b.

The conversion fails with an exception:

An unhandled exception of type 'System.InvalidCastException' occurred in Aspose.Pdf.dll

Additional information: Unable to cast object of type ' . ' to type ' . '.

Reproduced with Aspose.PDF 10.9.0.0.

Thanks in advance for your help,

Liza

↧

Background Image CSS Property

October 5, 2015, 2:40 am

Does Aspose support inline styles for a background image?

Refer to the attachment (HTML snippet as well as the image used).
1. In the browser, the HTML displays its content inside the shape "circle.png".
2. When we try to use Aspose, the shape does not show up.
3. If I change background-size to cover, it tries to draw the image, but the image does not fit the box.

My concern is around:
background-image: url("http://localhost/GEMSUI/Images/Charting/circle.png") !important; background-repeat: no-repeat; background-size: 100% 100%;

In the browser the image scales to the DIV size, but in the Aspose output of the same HTML it does not.

If I use background-size: cover; it gives a partial image.

↧

PDF Concatenate not working

October 7, 2015, 10:51 am

I am having trouble getting the concatenate feature of Aspose.PDF for Java to work properly. I have coded it like the sample in your documentation, but it keeps throwing an error at the line that adds the pages to the primary document. I am using an InputStream that holds the PDF file data, and I am able to get the page information before joining the two files together, but the join itself fails. Here is the message that I get:

main::com::aspose::pdf::internal::ms::System::z87=HASH(0xa77e004)

I do get the page info before this, so I know it is seeing the data, but as soon as I add the line

pdfDocument1.getPages().add(pdfDocument2.getPages());

it throws the above error. Here is the full code for review:

public String doPdfFileCombine(String Content1, String Content2) throws Exception {
    // Convert the string into an array of bytes and pass it into a new memory stream
    InputStream pdfStream1 = new ByteArrayInputStream(Base64.decodeBase64(Content1));
    // Open the first document from InputStream
    com.aspose.pdf.Document pdfDocument1 = new com.aspose.pdf.Document(pdfStream1);

    // Convert the second string into an array of bytes and pass it into a new memory stream
    InputStream pdfStream2 = new ByteArrayInputStream(Base64.decodeBase64(Content2));
    // Open the second document from InputStream
    com.aspose.pdf.Document pdfDocument2 = new com.aspose.pdf.Document(pdfStream2);

    // Create output streams
    ByteArrayOutputStream outStream1 = new ByteArrayOutputStream();
    ByteArrayOutputStream outStream2 = new ByteArrayOutputStream();

    // Save Files so you can getPages for concating
    pdfDocument1.save(outStream1, SaveFormat.Pdf);
    pdfDocument2.save(outStream2, SaveFormat.Pdf);

    String Test = "Page Count : " + pdfDocument1.getPages().size() + "Page Count : " + pdfDocument2.getPages().size();

    // Add pages of second document to the first
    pdfDocument1.getPages().add(pdfDocument2.getPages());

    // Save concatenated output file
    pdfDocument1.save(outStream1, SaveFormat.Pdf);

    Test += "\n Page Count2 : " + pdfDocument1.getPages().size() + "Page Count2 : " + pdfDocument2.getPages().size();

    // Convert outStream data to Base64 String so we can pass back to Perl
    //String outString = new String(Base64.encodeBase64Chunked(outStream1));

    return Test;
}

A side note on this issue: we currently use Aspose.Words to convert RTF to DOC and PDF, which is where the Content input comes from. What we are trying to do is append additional PDF files to the converted RTF->PDF data we already generate. The key requirement is that all the table of contents links still work after the files are concatenated. Will this approach preserve them? The software we use at the moment strips all hyperlink data when combining multiple PDFs. Any information and assistance would be appreciated.
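
Regarding the commented-out Base64 step in the code above: an encoder needs the bytes held by the stream, not the stream object itself. The following is a minimal sketch of that step, swapping in the JDK's own java.util.Base64 (Java 8 and later) for the Base64 class used in the post. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Base64;

/** Hypothetical helper for passing saved PDF bytes around as text. */
final class StreamBase64 {

    /** Encodes everything written to the stream so far as one Base64 string. */
    static String encode(ByteArrayOutputStream out) {
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    /** Decodes a Base64 string back into the original bytes. */
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }
}
```

A ByteArrayOutputStream keeps everything that has been written to it, so a stream that has already received one save would need `reset()` (or a fresh stream) before it is reused for the concatenated result.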

↧

JPEG 2000 to PDF

October 13, 2015, 8:27 am

Does Aspose.Pdf for .NET support the conversion of JP2 (JPEG 2000) images?

I saw that the Java version appears to do it in a non-native manner. Does .NET have the same functionality?

And what does non-native actually mean?

As an aside, it appears as though your brochure download is broken.

Thanks,

Brian

↧

Pdf to Word conversion - Hebrew

August 31, 2015, 2:01 am

Hi,

I have converted the attached file from PDF to Word using the Aspose.PDF API.

The Hebrew in the resulting Word document was scrambled.

Please correct this.

↧

Bug Report: PDF Text extraction takes several minutes, with 100% CPU

October 13, 2015, 9:13 am

We use Aspose for text extraction purposes only, on Java.
On some of our machines, text extraction for a small document takes several minutes, with 100% CPU and other threads locked, whereas it is very fast on others. The reason is simple: Aspose.PDF looks for font directories in a fixed list. The list is the following:
"%WINDIR%/Fonts/",
"/usr/openwin/lib/X11/fonts/TrueType/",
"/usr/local/share/fonts/",
"$home/.fonts/",
"/usr/share/fonts/truetype/",
"/usr/X11R6/lib/X11/fonts/ttfonts/",
"/Library/Fonts/",
"~/Library/Fonts/",
"/Network/Library/Fonts/",
"/System/Library/Fonts/",
"~/.fonts/",
"/usr/share/fonts/",
"/usr/share/X11/fonts/TTF/",
"/system/fonts/"

However, if none of these directories exists (this is distro-dependent), the fallback becomes "/"! As a result, one thread scans the entire hard drive while locking all the others.
This results in several minutes of 100% CPU activity with everything else locked.

The workaround is simple: create an empty ".fonts" directory in the home directory of the user running the application. But I clearly think this should be considered a bug!
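
The workaround can also be applied from the application itself at startup, before any extraction runs. This is a minimal sketch using only the Java standard library. The class and method names are hypothetical, and it assumes the application's `user.home` system property points at the same home directory the font lookup uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical startup helper that applies the ".fonts" workaround. */
final class FontDirWorkaround {

    /** Creates <home>/.fonts if it is missing; does nothing if it already exists. */
    static Path ensureFontsDir(Path home) throws IOException {
        return Files.createDirectories(home.resolve(".fonts"));
    }

    /** Same as above, using the home directory of the current user. */
    static Path ensureUserFontsDir() throws IOException {
        return ensureFontsDir(Paths.get(System.getProperty("user.home")));
    }
}
```

Calling `FontDirWorkaround.ensureUserFontsDir()` once before the first extraction should be enough, and it is safe to call again on every start.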

↧

Remove existing PDF security restrictions

February 27, 2012, 1:00 pm

I need to remove the security on existing PDF documents that have the following restrictions:

owner password: yes
user password: no
open: allowed
printing: not allowed
document assembly: not allowed
page extraction: not allowed

I want to remove these restrictions and save to a new PDF with the following allowed:

printing: allowed
document assembly: allowed
page extraction: allowed

When I open the existing PDF, decrypt and save to a new path I get this:

printing: allowed
document assembly: not allowed
page extraction: allowed

How do I remove the restriction on document assembly?

This is what I am doing:

string srcPath = "test.pdf";
System.IO.FileInfo fi = new System.IO.FileInfo(srcPath);
string destPath = fi.DirectoryName + "\\" + System.Text.RegularExpressions.Regex.Replace(fi.Name, fi.Extension, "_decrypted" + fi.Extension, System.Text.RegularExpressions.RegexOptions.IgnoreCase);

Aspose.Pdf.License license = new Aspose.Pdf.License();
license.SetLicense("Aspose.Pdf.TempExpires20120327.lic");

Aspose.Pdf.Document sourcePdf = new Document(srcPath);
sourcePdf.Decrypt();
sourcePdf.Save(destPath);

Thanks.

↧

Not all Adobe fields are extracted in Form Fields

October 13, 2015, 2:49 pm

Hi,

Environment: Libraries 10.3 and 10.9.0 (released Oct 2, 2015)

I want to extract Adobe form fields and noticed that out of 68, only 35 are extracted. Why is that the case? How can I extract all 68 of them?
I have attached the document reg135.pdf as a test case.

I use the following code to extract location coordinates of the form fields:

            //Step 1: Get all Adobe fields.
            Aspose.Pdf.Facades.Form form = new Aspose.Pdf.Facades.Form(filename);

            Aspose.Pdf.Facades.FormFieldFacade fieldfacade = null;
            string field_value = null;

            //get all field names
            String[] allfields = form.FieldNames;
            Aspose.Pdf.PageCollection pageCollection = pdfDocument.Pages;

            foreach (string mFieldName in allfields)
            {
                try
                {
                    fieldfacade = form.GetFieldFacade(mFieldName);
                    field_value = form.GetField(mFieldName);

                    Aspose.Pdf.Page pdfPage = pageCollection[textFragment.Page.Number];

                    if (String.IsNullOrEmpty(field_value)){

                        //Break it down & form the json data
                        System.Drawing.Rectangle box = fieldfacade.Box;

                        dynamic adobe_ft = new System.Dynamic.ExpandoObject();
                        adobe_ft.page_number = fieldfacade.PageNumber - 1;
                        adobe_ft.required = form.IsRequiredField(mFieldName);
                        adobe_ft.x = box.X;
                        adobe_ft.label = mFieldName;
                        adobe_ft.y = pdfPage.MediaBox.Height - box.Y - box.Height;
                        adobe_ft.width = box.Width;
                        adobe_ft.height = box.Height;

                        mAdobeDefinedfields.Add(adobe_ft);

                        fieldfacade.Reset();// resets all visual attributes to empty value.

                    }

                } catch(Exception e){
                    if (e.Message.Contains("cannot"))
                    {
                       // Trace.WriteLine("---- Field cannot be found: "+mFieldName);
                    }
                }
            }
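
The y calculation in the loop above converts a box measured from the bottom-left corner of the page (the PDF convention) into one measured from the top-left corner. As a minimal Java sketch of just that arithmetic (the class and method names are hypothetical, and it assumes the box and the page height use the same units):

```java
/** Hypothetical helper mirroring the y conversion used in the post. */
final class FieldBox {

    /**
     * Converts the lower edge of a box, measured upward from the bottom of the
     * page, into the upper edge of the same box measured downward from the top.
     */
    static double toTopLeftY(double pageHeight, double boxY, double boxHeight) {
        return pageHeight - boxY - boxHeight;
    }
}
```

For a page 792 units high, a box whose lower edge is at y = 700 with height 20 ends up 72 units below the top of the page.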

Questions:
1. How can I get all 68 form fields with the code given above? Can you please suggest a solution, or let me know if this is a bug?
2. This is a follow-up to another issue: http://www.aspose.com/community/forums/permalink/660958/660958/showthread.aspx#660958. You logged ticket PDFNEWNET-39486 for it, to correct the height values of extracted form fields. When is the fix expected?

Thank you,
Sireesha

↧

HTML to PDF conversion

October 27, 2015, 4:30 am

HTML to PDF conversion is not working, as mentioned under known issues. When can we expect it to work properly?

↧

XML file with Unicode Characters are not visible on pdf

October 26, 2015, 4:50 am

Hello,

I have a use case here.

I have an XML file with Unicode characters in languages such as Chinese, Spanish, Greek, Telugu and Hindi.

When I convert this XML to PDF, the Unicode characters are not rendered in the PDF.

Could you please look into this as soon as possible?

I have attached the source XML and Java files for your reference.

↧

Conversion PDF to DOC is not correct

October 23, 2015, 9:31 am

Hi,

I am trying Aspose.PDF for Java because I want to integrate it into my Swing application, but the conversion from PDF to DOC/ODT does not give good results.

I have attached an original PDF and the corresponding DOC and DOCX files.

Thanks

↧

Issues with PDFFileInfo in checking Encrypted, password Protected Files

October 22, 2015, 12:36 pm

Hi,

I am using the code below to check for encrypted and password-protected files. For encrypted files, isPdfFile always returns false and isEncrypted throws an error:

PdfFileInfo is not initialized. Use constructors with parameters or properties for initialization.

hasOpenPassword always returns true no matter what, and hasEditPassword always throws an invalid password exception, even if only one region of the file is protected.

Is there a way to capture all of this information for a PDF file?

if (fileInfo.isPdfFile()) {
    metaData.setMimeType(PDFMetadata.MIME_TYPE_PDF);
}

// File is Password protected for opening.
if (fileInfo.hasOpenPassword()) {
    metaData.setPasswordProtected(true);
}

// File is Password Protected for Editing.
if (fileInfo.hasEditPassword()) {
    metaData.setSecure(true);
}

// File is encrypted.
if (fileInfo.isEncrypted()) {
    metaData.setEncrypted(true);
}
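
As a cross-check for the first test only, whether a file is a PDF at all can be decided without any library by looking for the "%PDF-" marker at the start of the file. The PDF header is not encrypted, so this check does not depend on a password. The following is a minimal sketch using the Java standard library. The class and method names are hypothetical, and it says nothing about the password or encryption state.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: detects a PDF by its leading "%PDF-" marker. */
final class PdfHeaderCheck {

    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes begin with "%PDF-". */
    static boolean startsWithPdfHeader(byte[] data) {
        if (data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads only the first few bytes of the file and checks them. */
    static boolean looksLikePdf(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // file is shorter than the marker
                }
                read += n;
            }
        }
        return read == head.length && startsWithPdfHeader(head);
    }
}
```

This check is strict: it only accepts files where the marker is at the very first byte.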

Regards,

↧

TOC in Table over multiple pages

September 16, 2015, 8:32 am

Hi,

I have a scenario where I want to add the TOC to a table. The TOC can span multiple pages, so the table spans multiple pages as well. Is there a way to do this? When I try to add the TOC to a table, only the last page is printed.

Regards,

↧