Working with Alignments in Simple Engine Mode

The SIMPLE engine mode focuses on preserving the render layout of the original PDF report as it is converted to monospaced text.  The following table describes the properties that can be modified under this mode. 

Property

Description

Auto Adjust

This button instructs Data Prep Studio to automatically select the optimum settings for the displayed sample page. Note that if you have changed any of the PDF import settings, clicking this will likely restore the original settings.

Monospaced (for PDF Engine versions 4.4 and below)

This setting specifies that a monospaced font (also called a fixed-width or non-proportional font) was used in the PDF file. In a monospaced font, every character has the same width. For example, the "o" and "i" characters take up the same amount of horizontal space on a line. The opposite of monospacing is proportional spacing, in which different characters have different widths. For example, in a proportionally spaced font, the letter "o" is wider than the letter "i".

When you import a PDF file into Monarch Data Prep Studio, the application tries to detect whether monospaced fonts are used and optimizes the conversion accordingly. In some cases, Monarch Data Prep Studio may not detect that monospaced fonts were used in the PDF file, usually because the file mixes monospaced and proportional fonts. If you know that the PDF file uses monospaced fonts, but the text is not displaying correctly, you can select this setting to force Monarch Data Prep Studio to optimize for monospaced fonts. While proportionally spaced fonts look more appealing, monospaced fonts are better suited to tabular data because the uniform width of each character makes columns easier to align.

In general, PDF files generated using monospaced fonts convert more successfully, so if you are trying to optimize your PDF-producing application for Monarch Data Prep Studio, use monospaced fonts. Some of the more common ones are: Andale Mono, Anonymous, Crystal, Bitstream Vera Sans Mono, Courier, Courier New, Elronet Monospace, Everson Mono Latin 6, Fixedsys, Lucida Sans Typewriter, Lucida Console, and PrestigeFixed.
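The difference between the two kinds of font can be illustrated with a short sketch. The character widths below are hypothetical values chosen for illustration and are not taken from any real font or from Monarch Data Prep Studio itself. The sketch only shows why strings of equal length end at the same horizontal position in a monospaced font but not in a proportional one.

```python
# Hypothetical widths in arbitrary units; not measured from a real font.
PROPORTIONAL_WIDTHS = {"i": 3, "o": 6}  # "o" is wider than "i"
MONOSPACED_WIDTH = 6                     # every character gets the same width


def rendered_width(text, widths=None, default=MONOSPACED_WIDTH):
    """Return the horizontal space a string occupies.

    With widths=None the font is treated as monospaced, so the result
    depends only on the number of characters.
    """
    if widths is None:
        return len(text) * default
    return sum(widths.get(ch, default) for ch in text)


# Two four-character values that should end at the same column.
mono = [rendered_width(s) for s in ("iiii", "oooo")]
prop = [rendered_width(s, PROPORTIONAL_WIDTHS) for s in ("iiii", "oooo")]

print(mono)  # equal widths: the column edge lines up
print(prop)  # unequal widths: the column edge drifts
```

Because the monospaced widths depend only on character count, a horizontal position maps directly to a character column, which is what makes the converted text line up.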

Free-form (for PDF Engine versions 4.4 and below)

This option tries to optimize text that is more free-form than columnar or grouped columnar text. A columnar document is a simple table format, while grouped columnar might be something similar to one of the Monarch Data Prep Studio sample reports, such as Betty’s Music Store (Classic.pdf). A typical document that might benefit from using this setting would be an academic report that is 95% text, but which also contains a few tables that you want to extract. Note: This setting will sometimes work effectively on columnar documents when the default settings are not producing a good result.

Snap Text Left (PDF Engine version 4.5)

This setting aligns the text to the left of the imputed PDF grid.

Snap Text Up (PDF Engine version 4.5)

This setting aligns the text to the top of the imputed PDF grid.
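As a rough illustration of what snapping to a grid means, the following sketch moves a point to the left and top edges of the grid cell that contains it. This is an assumption about the general technique, not a description of how the PDF engine is implemented. The function name, the cell sizes, and the convention that y is measured downward from the top of the page are all hypothetical.

```python
import math


def snap_to_grid(x, y, cell_width, cell_height):
    """Snap a point to the left and top edges of its grid cell.

    Assumes y grows downward from the top of the page, so the top edge
    of a cell is the smaller y value. All names here are illustrative.
    """
    col = math.floor(x / cell_width)   # index of the cell column
    row = math.floor(y / cell_height)  # index of the cell row
    return col * cell_width, row * cell_height


# A text fragment positioned slightly inside a 6 x 12 cell.
snapped = snap_to_grid(13.7, 25.2, cell_width=6.0, cell_height=12.0)
print(snapped)
```

Snapping only the x value corresponds to aligning left, and snapping only the y value corresponds to aligning up.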

Always Align Left (PDF Engine version 4.5)

This setting instructs the application to always align text to the left of the imputed PDF grid.

Suppress Left Whitespaces (PDF Engine versions 4.2–4.5)

This setting instructs the application to remove all left-side white spaces when displaying the report.

Stretch

This setting governs how much horizontal space is used during the conversion process. When Monarch Data Prep Studio analyzes the PDF file, it tries to match the spacing of the original document as closely as possible, but several factors can make it necessary to introduce more spacing into the conversion than appears to exist in the original PDF file. One such factor is hidden data, i.e., data that is not visible on screen but still exists within the PDF file itself. This can be the result of columns that truncate the data, for example. At first glance, it is not apparent that any data is missing, but Monarch Data Prep Studio converts all the data in the PDF file, not just what is visible in a PDF viewing application. In this case, to maintain proper column alignment, Monarch Data Prep Studio has to recalculate and pad the spacing, as the original column spacing would not be enough to hold the data safely.

In general, Monarch Data Prep Studio uses more spacing than the PDF file does. When viewed in the Report window, this makes the document look stretched wider than the original PDF file, but Monarch Data Prep Studio errs on the side of caution so that columns won’t run into each other. This also means that if a later iteration of the same report (or a similar one) contains wider data values, the model will likely still work with it.

If you know your reports well, you can decrease the stretch value to make the reports look more presentable, thereby avoiding very small font sizes in the Report window or the need for horizontal scrolling.

Use the + and - buttons provided to specify a stretch value.
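The effect of the stretch value can be pictured with a minimal sketch. It assumes, purely for illustration, that each text fragment has a horizontal position that is converted to a character column by dividing by a character width and multiplying by a stretch factor. The function `to_text_line`, its parameters, and the sample data are hypothetical and do not represent the actual conversion algorithm.

```python
def to_text_line(fragments, char_width, stretch=1.0):
    """Place (x, text) fragments on one line of monospaced text.

    The target column is round(x / char_width * stretch). If a fragment
    would overlap the text already on the line, it is pushed to one
    space after that text so the values do not run together.
    """
    line = ""
    for x, text in sorted(fragments):
        col = round(x / char_width * stretch)
        if line and col <= len(line):
            col = len(line) + 1  # not enough room: shift right
        line = line.ljust(col) + text
    return line


# The second column starts at x=30 (five characters wide at width 6),
# but "Mozart" has six characters, as if the PDF column truncated it.
rows = [
    [(0, "Mozart"), (30, "12.50")],
    [(0, "Bach"), (30, "9.99")],
]

tight = [to_text_line(r, char_width=6, stretch=1.0) for r in rows]
wide = [to_text_line(r, char_width=6, stretch=2.0) for r in rows]

print("\n".join(tight))  # second column starts at different positions
print("\n".join(wide))   # second column lines up in both rows
```

With the smaller factor, the longer value forces its row to shift and the columns no longer line up. With the larger factor, both rows have room and the second column starts at the same position, at the cost of a wider line.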

Crop

This setting instructs the application to crop extra space from the PDF page. Use the + and - buttons provided to specify a crop value.

 

 

 


© 2024 Altair Engineering Inc. All Rights Reserved.
