{"id":13745,"date":"2020-06-05T17:23:30","date_gmt":"2020-06-05T21:23:30","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=13745"},"modified":"2022-05-24T15:16:29","modified_gmt":"2022-05-24T19:16:29","slug":"data-preparation-datadog-voractiy","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/","title":{"rendered":"Feeding Datadog with Voracity Part 2: Data Preparation &#038; Transmission"},"content":{"rendered":"<p><i>This article is the second in a 4-part <a href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/datadog-with-voracity\/\">series<\/a> on feeding the Datadog cloud analytic platform with different kinds of data from <\/i><a href=\"https:\/\/www.iri.com\/products\/voracity\"><i>IRI Voracity<\/i><\/a><i> operations. It focuses on preparing data in Voracity, and getting Datadog ready to receive it. Other articles in the series cover: the need for Voracity ahead of Datadog; displaying Voracity-wrangled data in Datadog; and, using <\/i><a href=\"https:\/\/www.iri.com\/products\/darkshield\"><i>IRI DarkShield<\/i><\/a><i> search logs in security analytics.<\/i><\/p>\n<p>My previous <a href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/datadog-with-voracity\/\">article<\/a> provided an overview of Datadog and how IRI Voracity can accelerate Datadog visualizations through <a href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/a-fresh-look-at-data-preparation\/\">external data preparation<\/a> (data wrangling), masking, cleansing, synthesis, etc. This article focuses on how you can prepare raw data for analytics in Voracity and send it seamlessly into Datadog, and the next article will document what to do with that data once it\u2019s in Datadog.<\/p>\n<p>Wrangling large data, particularly structured file inputs, has always been a core IRI strength. 
In this case, I am using the CoSort <a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl\">SortCL<\/a> program in the Voracity platform to perform a basic sort and filter operation of a very large CSV file containing UK company data (see the first article).<\/p>\n<h5><b>My Voracity Data Wrangling Job<\/b><\/h5>\n<p>Below is a view of the raw UK company input data, the SortCL-powered data wrangling jobs shown in its \u201ctransform mapping diagram\u201d form, and the Voracity output results for Datadog. They are managed in <a href=\"https:\/\/www.iri.com\/products\/workbench\">IRI Workbench<\/a>, the free graphical IDE for Voracity, built on Eclipse\u2122:<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-workbench-collage.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13748 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-workbench-collage-1024x554.png\" alt=\"\" width=\"852\" height=\"461\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-workbench-collage-1024x554.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-workbench-collage-300x162.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-workbench-collage-768x415.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-workbench-collage.png 1110w\" sizes=\"(max-width: 852px) 100vw, 852px\" \/><\/a><\/p>\n<p>The actual SortCL script that runs the job shown (and can also be edited) in IRI Workbench is:<\/p>\n<pre># Generated Automatically with IRI Workbench - New CoSort SortCL Job\r\n\u00a0# Author: Devon K\r\n\u00a0# Created: 2020-04-03 14:41:45\r\n\r\n\/INFILES=C:\/Users\/Devon\/Downloads\/UKCompanyData\/BasicCompanyDataAsOneFile-2020-04-01.csv\r\n\u00a0\u00a0\u00a0 \/PROCESS=CSV\r\n\u00a0\u00a0\u00a0 \/ALIAS=BASICCOMPANYDATAASONEFILE_2020_04_01_CSV\r\n\u00a0\u00a0\u00a0 \/FIELD=(COMPANYNAME, TYPE=ASCII, POSITION=1, SEPARATOR=\",\", 
FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(_COMPANYNUMBER, TYPE=ASCII, POSITION=2, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(REGADDRESS_CAREOF, TYPE=ASCII, POSITION=3, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 # \/FIELD statements #4-54 redacted for brevity - contact me if you\u2019d like them\r\n    \/FIELD=(_CONFSTMTLASTMADEUPDATE, TYPE=ASCII, POSITION=55, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\r\n\/SORT\r\n\u00a0\u00a0\u00a0 \/KEY=(COMPANYNAME)\r\n\r\n\/OUTFILE=C:\/Users\/Devon\/Downloads\/UKCompanyData\/sortedUKdata.csv\r\n\u00a0\u00a0\u00a0 \/PROCESS=CSV\r\n\u00a0\u00a0\u00a0 \/FIELD=(COMPANYNAME, TYPE=ASCII, POSITION=1, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(REGADDRESS_ADDRESSLINE1mask=replace_chars(REGADDRESS_ADDRESSLINE1,\"x\"), TYPE=ASCII, POSITION=2, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(_REGADDRESS_ADDRESSLINE2, TYPE=ASCII, POSITION=3, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(REGADDRESS_POSTTOWN, TYPE=ASCII, POSITION=4, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(REGADDRESS_COUNTY, TYPE=ASCII, POSITION=5, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(REGADDRESS_COUNTRY, TYPE=ASCII, POSITION=6, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(REGADDRESS_POSTCODE, TYPE=ASCII, POSITION=7, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(COMPANYCATEGORY, TYPE=ASCII, POSITION=8, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(COMPANYSTATUS, TYPE=ASCII, POSITION=9, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(COUNTRYOFORIGIN, TYPE=ASCII, POSITION=10, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(DISSOLUTIONDATE, TYPE=ASCII, POSITION=11, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(INCORPORATIONDATE, TYPE=ASCII, POSITION=12, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(ACCOUNTS_ACCOUNTREFDAY, 
TYPE=ASCII, POSITION=13, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(ACCOUNTS_ACCOUNTREFMONTH, TYPE=ASCII, POSITION=14, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(ACCOUNTS_NEXTDUEDATE, TYPE=ASCII, POSITION=15, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(ACCOUNTS_LASTMADEUPDATE, TYPE=ASCII, POSITION=16, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(RETURNS_NEXTDUEDATE, TYPE=ASCII, POSITION=17, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(MORTGAGES_NUMMORTCHARGES, TYPE=ASCII, POSITION=18, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(MORTGAGES_NUMMORTOUTSTANDING, TYPE=ASCII, POSITION=19, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(MORTGAGES_NUMMORTPARTSATISFIED, TYPE=ASCII, POSITION=20, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(MORTGAGES_NUMMORTSATISFIED, TYPE=ASCII, POSITION=21, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(LIMITEDPARTNERSHIPS_NUMGENPARTNERS, TYPE=ASCII, POSITION=22, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(LIMITEDPARTNERSHIPS_NUMLIMPARTNERS, TYPE=ASCII, POSITION=23, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(URI, TYPE=ASCII, POSITION=24, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(CONFSTMTNEXTDUEDATE, TYPE=ASCII, POSITION=25, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\u00a0\u00a0\u00a0 \/FIELD=(_CONFSTMTLASTMADEUPDATE, TYPE=ASCII, POSITION=26, SEPARATOR=\",\", FRAME=\"\\\"\")\r\n\/INCLUDE WHERE MORTGAGES_NUMMORTOUTSTANDING GT 1\r\n\/OMIT WHERE REGADDRESS_POSTTOWN CT \"Cambridge\"\r\n\/INCLUDE WHERE COUNTRYOFORIGIN CT \"England\"<\/pre>\n<p>Note that this job script was automatically generated by an <a href=\"https:\/\/www.iri.com\/products\/workbench\/cosort-gui\">IRI Workbench<\/a> wizard. 
Workbench provides a syntax-aware editor for those who prefer to interact with the metadata manually, as well as the mapping diagrams and outlines shown above, and graphical dialogs and form editors. Changes made in any one mode are reflected in the others (\u201cdifferent strokes for different folks\u201d).<\/p>\n<p>Now that I have a rapidly-made, BI-ready target file (or \u201clog\u201d in Datadog parlance), I can feed it to Datadog in real time!<\/p>\n<h5><b>The Datadog Feeder Agent<\/b><\/h5>\n<p>To get started, a Datadog <a href=\"https:\/\/docs.datadoghq.com\/getting_started\/agent\/?tab=datadogussite\">agent should be installed and running<\/a> on the same machine(s) where the IRI output or log data will go. Datadog, like Voracity, supports AIX, Linux, macOS, and Windows. You thus need access to the IRI data target machine(s) to configure the agent.<\/p>\n<p>The Datadog agent can be installed on multiple computers and still send logs to the same web interface, which is accessed just by logging into your Datadog account. Go to the Datadog base directory (varies by O\/S) on the agent machine, and see the <a href=\"https:\/\/docs.datadoghq.com\/agent\/guide\/agent-configuration-files\/?tab=agentv6v7#agent-main-configuration-file\">configuration advice here<\/a>.<\/p>\n<p>Log collection must be enabled in the main configuration file (<b>datadog.yaml<\/b>) by setting \u201clogs_enabled: true\u201d. 
This is one of the settings in the datadog.yaml file, which lists many settings regarding the configuration of the Datadog agent.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-yaml.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13752 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-yaml.png\" alt=\"\" width=\"460\" height=\"628\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-yaml.png 508w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-yaml-220x300.png 220w\" sizes=\"(max-width: 460px) 100vw, 460px\" \/><\/a><\/p>\n<p>Next, create a directory within the <i>conf.d<\/i> folder called <i>CoSort.d<\/i>. This can be called anything, as long as it ends with \u201c.d\u201d. Create a <b>conf.yaml <\/b>file in this directory, and format it like this:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13753 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png\" alt=\"\" width=\"848\" height=\"310\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png 848w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml-300x110.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml-768x281.png 768w\" sizes=\"(max-width: 848px) 100vw, 848px\" \/><\/a><\/p>\n<p>Everything following a \u201c#\u201d is a comment. The example above includes several that describe the structure of a typical conf.yaml used to specify locations to collect logs.<\/p>\n<p>Asterisks can be used to specify all of a certain file type, directory, or files or directories that start with certain letters. Tags can also be added to provide more detail when searching the logs later.<\/p>\n<p>If the type of log input source in this configuration file is tcp, then the logs will be sent in real-time. 
If the type of input is specified as a file, then logs will be sent in \u201cbatches\u201d via HTTPS every 1-10 seconds, depending on a setting that can be specified in the datadog.yaml file.<\/p>\n<p>I also set up a similar <b>conf.yaml<\/b> file in a <i>DarkShield.d<\/i> directory within the <i>conf.d<\/i> directory for work I will demonstrate in Article #4 of this series. This YAML file will log all DarkShield .search files in a specified directory:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml-2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13754 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml-2.png\" alt=\"\" width=\"844\" height=\"311\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml-2.png 844w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml-2-300x111.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml-2-768x283.png 768w\" sizes=\"(max-width: 844px) 100vw, 844px\" \/><\/a><\/p>\n<h5><b>Enable Agent Permission<\/b><\/h5>\n<p>The Datadog agent needs system permission to collect the data in the Voracity target directories. On Windows, right-click the lowest-level folder the agent needs access to, and select \u201cAdd\u201d to add a user. On Linux, you may have to use chmod to make the directory readable by the agent.<\/p>\n<p>A user called <b>ddagentuser<\/b> should have been created on installation of the Datadog agent. 
Type this into the \u201cEnter the object names to select\u201d text box, as shown in the image below:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13755 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser.png\" alt=\"\" width=\"457\" height=\"251\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser.png 457w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser-300x165.png 300w\" sizes=\"(max-width: 457px) 100vw, 457px\" \/><\/a><\/p>\n<p>Click \u201cOK\u201d to add the user, then make sure both <i>read <\/i>and <i>write <\/i>permissions exist for the ddagentuser. <i>Modify <\/i>permissions need not be granted. Permissions should now look like this:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser-permisions.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13756 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser-permisions.png\" alt=\"\" width=\"363\" height=\"450\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser-permisions.png 363w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/ddagentuser-permisions-242x300.png 242w\" sizes=\"(max-width: 363px) 100vw, 363px\" \/><\/a><\/p>\n<h5><b>Multiple Configurations for Multiple Feeds<\/b><\/h5>\n<p>You can set up as many configurations as needed to allow all the data you want to be logged.<\/p>\n<p>For example, if you want all .csv files in a certain CoSort target file directory, follow the pattern: <i>path\/to\/directory\/*.csv<\/i> . If you want all DarkShield<i> .search <\/i>files in a directory already in your path to be logged, just specify<i> *.search<\/i> with no prior directory.<\/p>\n<p>Only new files will be logged; existing files will not. 
So, set up your Datadog logging configurations before running any IRI jobs; that way, none of the files those jobs produce will be missed.<\/p>\n<h5><b>Restart the Agent to Register Your Changes<\/b><\/h5>\n<p>After you modify the Datadog agent configuration above, you need to restart the agent so that those changes are registered and the feeds from Voracity start flowing into Datadog. To restart:<\/p>\n<ol>\n<li>Navigate to the directory that contains the Datadog executable in an administrative PowerShell, or terminal on Linux. On Windows, look in C:\\Program Files\\Datadog\\Datadog Agent\\bin.<\/li>\n<li>Run the command .\/agent restart-service to restart the Datadog agent. The restart-service subcommand is specific to Windows; on Linux, restart the datadog-agent service instead (for example, with sudo service datadog-agent restart).<\/li>\n<li>Run the command .\/agent status to confirm the agent will send IRI data to Datadog.<\/li>\n<\/ol>\n<p>From the status display shown below, you can see that I specified a CoSort log source for all .out files in a certain directory, and that an output file from SortCL was logged into Datadog. By default, the Datadog agent will send the content of the files (as \u201clogs\u201d) to the Datadog server via TCP. Logs can also be sent via HTTPS.<\/p>\n<p>HTTPS is more reliable than TCP and is the setting Datadog recommends. However, TCP sends logs in real time, while HTTPS gathers logs in \u201cbatches\u201d every 1-10 seconds. The HTTPS gather time can be modified in another setting in the datadog.yaml file in the main Datadog directory.<\/p>\n<p>HTTPS will become the default setting in future releases of Datadog. 
HTTPS can be set by going to the logs_config setting in datadog.yaml and setting the following two sub-settings to <i>true<\/i>: use_http and use_port_443.<\/p>\n<p>I also specified a DarkShield source that will log all .search files in a specified directory produced by the DarkShield<i> Dark Data Search\/Masking Job<\/i> wizard in IRI Workbench:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-darkshield-powershell-cropped.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13758 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-darkshield-powershell-cropped.png\" alt=\"\" width=\"377\" height=\"726\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-darkshield-powershell-cropped.png 377w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-darkshield-powershell-cropped-156x300.png 156w\" sizes=\"(max-width: 377px) 100vw, 377px\" \/><\/a><\/p>\n<h5><b>Setting Up the Datadog Processing Pipeline<\/b><\/h5>\n<p>Before we actually feed Datadog, we have another step to consider and configure: creating pipelines and parsing rules that extract attributes from the logs and turn them into \u2018facets\u2019 that can be used as values in visualizations.<\/p>\n<p>To make the best use of visualizations and analytics, attributes will need to be parsed from the logs in Datadog. However, to create a custom parser to extract these attributes, a pipeline must first be created.<\/p>\n<p>You can create a pipeline under the <i>Logs &gt; Configuration<\/i> section of the Datadog web interface, accessed simply by logging into your Datadog account. A pipeline allows you to filter logs based on facets and queries. 
Once a pipeline has been created, <a href=\"https:\/\/docs.datadoghq.com\/logs\/processing\/parsing\/?tab=matcher#overview\">parsing<\/a> rules can be generated for all the logs that will fall within the scope of that pipeline by using a processor.<\/p>\n<p>Note that custom parsing rules are only necessary for logs that are not in XML or JSON format. If custom parsing rules are necessary for your logs, this step should be taken before sending any logs to Datadog. This is so that Datadog will be able to extract attributes from your logs from the start. Otherwise, attributes can only be extracted from newly incoming logs, not from any existing logs.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-pipelines.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13759 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-pipelines.png\" alt=\"\" width=\"424\" height=\"489\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-pipelines.png 424w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-pipelines-260x300.png 260w\" sizes=\"(max-width: 424px) 100vw, 424px\" \/><\/a><\/p>\n<p>Datadog has several <a href=\"https:\/\/docs.datadoghq.com\/logs\/processing\/processors\/?tab=ui#grok-parser\">processors<\/a>; I will be using the Grok Parser. To use the Grok Parser, click on <b><i>Add Processor<\/i><\/b> underneath the pipeline whose logs you want to parse attributes from. Enter a sample of the type of log you want to process. Then, enter parsing rules to extract attributes from your data.<\/p>\n<p>If you entered a log sample, Datadog will automatically show you how your log will be parsed based on your parsing rules as you type them. Datadog parsing <a href=\"https:\/\/docs.datadoghq.com\/logs\/processing\/parsing\/?tab=matcher#overview\">rules<\/a> are quite powerful and diverse. 
Ultimately, the attributes are extracted in JSON format:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/edit-gork-parser.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13760 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/edit-gork-parser.png\" alt=\"\" width=\"819\" height=\"607\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/edit-gork-parser.png 968w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/edit-gork-parser-300x222.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/edit-gork-parser-768x569.png 768w\" sizes=\"(max-width: 819px) 100vw, 819px\" \/><\/a><\/p>\n<p>If attributes are being properly extracted from the logs, all the attributes should be visible by clicking on one of the log entries.<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-log-entries.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13761 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-log-entries-1024x555.png\" alt=\"\" width=\"851\" height=\"461\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-log-entries-1024x555.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-log-entries-300x163.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-log-entries-768x416.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-log-entries.png 1110w\" sizes=\"(max-width: 851px) 100vw, 851px\" \/><\/a><\/p>\n<p>Datadog can only extract attributes from logs sent if these pipeline and processing rules exist first, unless the logs are in XML or JSON format.<\/p>\n<h5><b>Log File Attributes (Facets)<\/b><\/h5>\n<p>A facet provides additional information about logs that can be used to filter, search, and create visualizations. 
Facets can be generated from attributes associated with logs in Log Explorer.<\/p>\n<p>Attributes are specific slices of data within a log, such as a single field in a record. A facet for a filename, for example, can be set up before any logs are sent by clicking <i>Add Facet<\/i> and choosing the predefined suggestion for a filename facet that appears.<\/p>\n<p>To generate a facet from an attribute, simply click on the log, then click on the attribute you want to generate a facet from. A menu will appear with the option to add a facet:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-facets.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-13762 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2020\/06\/datadog-facets.png\" alt=\"\" width=\"254\" height=\"750\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-facets.png 254w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/datadog-facets-102x300.png 102w\" sizes=\"(max-width: 254px) 100vw, 254px\" \/><\/a><\/p>\n<p>While attributes are necessary to generate facets based on the data within the file and not the properties of the file, just having attributes extracted from logs will not allow you to create visualizations. The attribute must also be added as a facet to allow that functionality.<\/p>\n<p>The filename facet keeps track of the file that the data in each log came from. With this facet selected, it is easier to narrow search results to a specific file or files. You can also specify the time that the file was sent to Datadog to narrow down query results, but setting up a facet for filename is generally simpler and more effective.<\/p>\n<h5><b>Next Steps<\/b><\/h5>\n<p>At this point, everything should be set up for Datadog to collect Voracity logs. Once in Datadog, it will be easy to turn this wrangled data into useful analytic query results and visualizations. 
In the next <a href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/datadog-collecting-leveraging-data\/\">article<\/a> of this series, I will demonstrate collection and visualization, and discuss Datadog\u2019s ability to send alerts based on certain specified values or thresholds.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This article is the second in a 4-part series on feeding the Datadog cloud analytic platform with different kinds of data from IRI Voracity operations. It focuses on preparing data in Voracity, and getting Datadog ready to receive it. Other articles in the series cover: the need for Voracity ahead of Datadog; displaying Voracity-wrangled data<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\" title=\"Feeding Datadog with Voracity Part 2: Data Preparation &#038; Transmission\">Read More<\/a><\/div>\n","protected":false},"author":119,"featured_media":13753,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[108,32],"tags":[273,373,359,1163,1472,100,789,850,981],"class_list":["post-13745","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data-2","category-business-intelligence","tag-bi","tag-bi-tool-acceleration","tag-data-preparation","tag-data-wrangling","tag-datadog","tag-etl","tag-iri-voracity","tag-iri-workbench","tag-logging"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Feeding Datadog with Voracity Part 2: Data Preparation &amp; Transmission - IRI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, 
max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Feeding Datadog with Voracity Part 2: Data Preparation &amp; Transmission\" \/>\n<meta property=\"og:description\" content=\"This article is the second in a 4-part series on feeding the Datadog cloud analytic platform with different kinds of data from IRI Voracity operations. It focuses on preparing data in Voracity, and getting Datadog ready to receive it. Other articles in the series cover: the need for Voracity ahead of Datadog; displaying Voracity-wrangled dataRead More\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2020-06-05T21:23:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-05-24T19:16:29+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png\" \/>\n\t<meta property=\"og:image:width\" content=\"848\" \/>\n\t<meta property=\"og:image:height\" content=\"310\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Devon Kozenieski\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Devon Kozenieski\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\"},\"author\":{\"name\":\"Devon Kozenieski\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1\"},\"headline\":\"Feeding Datadog with Voracity Part 2: Data Preparation &#038; Transmission\",\"datePublished\":\"2020-06-05T21:23:30+00:00\",\"dateModified\":\"2022-05-24T19:16:29+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\"},\"wordCount\":1808,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png\",\"keywords\":[\"BI\",\"bi tool acceleration\",\"data preparation\",\"data wrangling\",\"DataDog\",\"ETL\",\"IRI Voracity\",\"IRI Workbench\",\"logging\"],\"articleSection\":[\"Big Data\",\"Business Intelligence (BI&#041;\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\",\"url\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\",\"name\":\"Feeding Datadog with Voracity Part 2: Data Preparation & Transmission - 
IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png\",\"datePublished\":\"2020-06-05T21:23:30+00:00\",\"dateModified\":\"2022-05-24T19:16:29+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png\",\"width\":848,\"height\":310},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Feeding Datadog with Voracity Part 2: Data Preparation &#038; Transmission\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management 
Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1\",\"name\":\"Devon Kozenieski\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g\",\"caption\":\"Devon Kozenieski\"},\"url\":\"https:\/\/www.iri.com\/blog\/author\/devonk\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Feeding Datadog with Voracity Part 2: Data Preparation & Transmission - IRI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/","og_locale":"en_US","og_type":"article","og_title":"Feeding Datadog with Voracity Part 2: Data Preparation & Transmission","og_description":"This article is the second in a 4-part series on feeding the Datadog cloud analytic platform with different kinds of data from IRI Voracity operations. It focuses on preparing data in Voracity, and getting Datadog ready to receive it. Other articles in the series cover: the need for Voracity ahead of Datadog; displaying Voracity-wrangled data","og_url":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/","og_site_name":"IRI","article_published_time":"2020-06-05T21:23:30+00:00","article_modified_time":"2022-05-24T19:16:29+00:00","og_image":[{"width":848,"height":310,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png","type":"image\/png"}],"author":"Devon Kozenieski","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Devon Kozenieski","Est. 
reading time":"11 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/"},"author":{"name":"Devon Kozenieski","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1"},"headline":"Feeding Datadog with Voracity Part 2: Data Preparation & Transmission","datePublished":"2020-06-05T21:23:30+00:00","dateModified":"2022-05-24T19:16:29+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/"},"wordCount":1808,"commentCount":0,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png","keywords":["BI","bi tool acceleration","data preparation","data wrangling","DataDog","ETL","IRI Voracity","IRI Workbench","logging"],"articleSection":["Big Data","Business Intelligence (BI)"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/","url":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/","name":"Feeding Datadog with Voracity Part 2: Data Preparation & Transmission - 
IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png","datePublished":"2020-06-05T21:23:30+00:00","dateModified":"2022-05-24T19:16:29+00:00","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png","width":848,"height":310},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-preparation-datadog-voractiy\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Feeding Datadog with Voracity Part 2: Data Preparation &#038; Transmission"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1","name":"Devon Kozenieski","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g","caption":"Devon 
Kozenieski"},"url":"https:\/\/www.iri.com\/blog\/author\/devonk\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2020\/06\/conf-yaml.png","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13745"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/119"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=13745"}],"version-history":[{"count":12,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13745\/revisions"}],"predecessor-version":[{"id":15861,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13745\/revisions\/15861"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/13753"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=13745"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=13745"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=13745"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}