Genii Weblog

Delving deeper into your data - Part 1

Mon 10 Jun 2019, 04:58 PM



by Ben Langhinrichs
On Friday, I posted Delving deeper into your data - Intro, in which I introduced the series and its goals. I started by suggesting a series of questions or queries we might want to answer that all go deeper than you can go with DQL, selection formulas, or views. My purpose is to explain more about how the Midas LSX can answer these sorts of questions, and how it can provide the answers in convenient JSON or XML form. The first question: 
 
How would I find all the product images in our offerings database that do not have corporate's dictated 1.91:1 aspect ratio or that are under 450 pixels wide?
 
To answer this, we have to iterate through the documents. While we could narrow down that set using a view or selection formula or DQL, I'm going to assume we are checking all the documents. The Midas part comes in when we go inside the document and inside one or more rich text fields to iterate through the images. Take a look at the Export Directive from our Export to JSON sample db, a data-driven way of harnessing the Midas engine without having to write any code. The numbered steps below match the numbered callouts in the image that follows:
 
1) Select all the documents (this is where we could use various criteria, including DQL).
2) Specify what values we want to show up in the JSON. Both methods for getting a chunk property are used.
3) We have chosen to split the result by chunk, so we have to specify the rich text field and the target type. We will use 'Graphic', which refers to any image. See the Midas 101 - Chunk Definitions post for more details - the fact that it was written 16 years ago should give you an idea of how durable the chunk concept has proven to be.
4) The chunk filter formula is like a selection formula for chunks. If it evaluates to True, the image is included and a record is written to JSON for this result. If not, it is skipped.
5) The JSON format to use is specified. There is a Midas-defined default format, but you can change to MongoDB or Salesforce or any of the others.
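
For readers who think more easily in code, the five steps can be pictured roughly as the loop below. This is only a sketch in Python, not the Midas API and not how the Export Directive is implemented: the `Image` and `Document` classes, the `export_images` function, the output field names, and the sample data are all hypothetical stand-ins, and the width test passed in at the end is just an illustrative filter.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Image:
    """Hypothetical stand-in for a 'Graphic' chunk (not a Midas class)."""
    width_px: int
    height_px: int

    @property
    def ratio(self) -> float:
        return self.width_px / self.height_px


@dataclass
class Document:
    """Hypothetical stand-in for a document with one rich text field."""
    unid: str
    images: List[Image] = field(default_factory=list)


def export_images(docs: Iterable[Document],
                  chunk_filter: Callable[[Image], bool]) -> str:
    records = []
    for doc in docs:                                       # step 1: selected documents
        for index, img in enumerate(doc.images, start=1):  # step 3: split by chunk
            if not chunk_filter(img):                      # step 4: chunk filter
                continue
            records.append({                               # step 2: values to output
                "unid": doc.unid,
                "image": index,
                "width_px": img.width_px,
                "ratio": round(img.ratio, 2),
            })
    return json.dumps(records, indent=2)                   # step 5: JSON format


# Made-up sample data, purely for illustration.
docs = [
    Document("A1", [Image(1200, 628), Image(300, 300)]),
    Document("B2", [Image(400, 209)]),
]
print(export_images(docs, lambda img: img.width_px >= 450))
```

The point of the sketch is the shape of the work: one JSON record per image that passes the filter, rather than one record per document.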
 
Inline JPEG image
 
Now, you may have noticed that in my chunk selection formula, I did basically the opposite of the question. That is because I don't have a product database with images, so instead I used one of the old Business Partner forum databases from 2007, and I will look for any images in the 13,000+ documents in that db to see if there are images that are in the correct range and over the specified size. It turns out there are six of them. See the JSON result set below. This took about 10 seconds, though there are lots of optimization details I ignored for the sake of this demo.
 
I'm going to leave this here without a lot more discussion, but please don't hesitate to ask in the comments or by email if you want clarification or are curious about the features. By the way, the actual chunk filter formula to answer the original question would be @ChunkNum(GraphicRatio) != 1.91 | @ChunkNum(GraphicWidthPX) < 450, but I'm guessing you figured that out for yourself.
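
As a rough illustration of that filter's logic outside of Notes, here is the same test written as a plain Python function. It is a sketch under assumptions: the function name and parameters are hypothetical, and the `tolerance` argument is my own addition, since the post does not say whether GraphicRatio is rounded to two decimals before the comparison with 1.91.

```python
def violates_guideline(width_px: int, height_px: int,
                       target_ratio: float = 1.91,
                       min_width_px: int = 450,
                       tolerance: float = 0.005) -> bool:
    """Return True if an image breaks either rule from the original question."""
    if height_px <= 0:
        raise ValueError("height_px must be positive")
    ratio = width_px / height_px
    # Assumption: treat ratios within `tolerance` of the target as matching,
    # because an exact floating-point comparison would reject e.g. 1200x628.
    wrong_ratio = abs(ratio - target_ratio) > tolerance
    too_narrow = width_px < min_width_px
    return wrong_ratio or too_narrow
```

For example, `violates_guideline(1200, 628)` is false (the ratio is about 1.911 and the width is fine), while `violates_guideline(400, 209)` is true only because the image is narrower than 450 pixels.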
 
Inline JPEG image
 

Copyright © 2019 Genii Software Ltd.
