{"id":13061,"date":"2019-08-08T17:51:48","date_gmt":"2019-08-08T21:51:48","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=13061"},"modified":"2026-02-23T17:08:37","modified_gmt":"2026-02-23T22:08:37","slug":"mongodb-cassandra-darkshield","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/","title":{"rendered":"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method"},"content":{"rendered":"<p><em><strong>Editor&#8217;s Note:<\/strong><\/em><\/p>\n<p><em>As of 2024 and DarkShield V5, the search and masking methods for these NoSQL DBs have been upgraded; please refer to the discussion and demonstration in <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/finding-and-masking-nosql-dbs-with-the-darkshield-gui\/\">this updated article<\/a> instead!<\/em><\/p>\n<p><span style=\"font-weight: 400;\">This article demonstrates the use of <\/span><a href=\"https:\/\/www.iri.com\/products\/darkshield\"><span style=\"font-weight: 400;\">IRI DarkShield<\/span><\/a><span style=\"font-weight: 400;\"> to identify and remediate (<\/span><a href=\"https:\/\/www.iri.com\/solutions\/data-masking\"><span style=\"font-weight: 400;\">mask<\/span><\/a><span style=\"font-weight: 400;\">) personally identifiable information (PII) and other sensitive data in MongoDB, Cassandra, and Elasticsearch databases. Although these steps mainly focus on finding and shielding data in MongoDB collections, the same steps can be used for data in Cassandra tables as well. 
See <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/find-and-mask-elasticsearch\/\">this article<\/a> on Elasticsearch, too.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Note that this article represents the <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/native-mongodb-masking-voracity\/\"><span style=\"font-weight: 400;\">fourth method<\/span><\/a><span style=\"font-weight: 400;\"> IRI supports for masking data in MongoDB, and the <\/span><a href=\"https:\/\/www.iri.com\/blog\/vldb-operations\/how-to-mask-cassandra-datastax-with-iri-fieldshield\/\"><span style=\"font-weight: 400;\">second method<\/span><\/a><span style=\"font-weight: 400;\"> for Cassandra. Those previous and still supported methods rely on structured data discovery and de-identification via\u00a0<\/span><a href=\"https:\/\/www.iri.com\/products\/fieldshield\"><span style=\"font-weight: 400;\">IRI FieldShield<\/span><\/a><span style=\"font-weight: 400;\">, while the DarkShield method supports textual data in either structured or unstructured collections.<span id='easy-footnote-1-13061' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#easy-footnote-bottom-1-13061' title='If PII is embedded in binary objects within your MongoDB, Cassandra, or Elasticsearch collections, we can help automate their extraction to standalone files for DarkShield search\/mask operations, and their reimport.'><sup>1<\/sup><\/a><\/span> Though DarkShield and FieldShield are standalone IRI data masking products, both are included in the <a href=\"https:\/\/www.iri.com\/products\/voracity\">IRI Voracity<\/a> data management platform.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The latest approach uses some elements of <\/span><i><span style=\"font-weight: 400;\">Data Classification<\/span><\/i><span style=\"font-weight: 400;\">, an integrated data cataloging paradigm for defining 
the search methods used for finding PII independently from the source of the data. While this article provides a small introduction to Data Classification during Step 1, you may find it useful to read up on how Data Classification fits into our unified approach to conducting searches. For more information on Data Classification in the <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\"> front end for DarkShield et al<span id='easy-footnote-2-13061' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#easy-footnote-bottom-2-13061' title='&lt;\/span&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;The &lt;\/span&gt;&lt;a href=&quot;https:\/\/www.iri.com\/products\/workbench&quot;&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;IRI Workbench&lt;\/span&gt;&lt;\/a&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt; IDE, built on Eclipse\u2122,\u00a0 front-ends all FieldShield, DarkShield, and related data masking &amp;#8212; and broader data handling\u00a0&lt;\/span&gt;&lt;a href=&quot;https:\/\/www.iri.com\/products\/voracity\/technical-details#capabilities&quot;&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;capabilities&lt;\/span&gt;&lt;\/a&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt; &amp;#8212; in the &lt;\/span&gt;&lt;a href=&quot;https:\/\/www.iri.com\/products\/voracity&quot;&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;IRI Voracity&lt;\/span&gt;&lt;\/a&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt; platform.&lt;\/span&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;'><sup>2<\/sup><\/a><\/span><\/span><span style=\"font-weight: 400;\">, please read <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/data-classification-in-iri-workbench\/\"><span style=\"font-weight: 400;\">this article<\/span><\/a><span 
style=\"font-weight: 400;\"> before proceeding.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The identification and remediation of PII using IRI DarkShield involves four general steps:<\/span><\/p>\n<p><b>(Optional) Step &#8211; Register Your Data Source(s)<\/b><\/p>\n<p><span style=\"font-weight: 400;\">In this (optional) step, data sources for a Mongo database, Cassandra keyspace, or Elasticsearch cluster are registered. This allows data sources to be reusable. As a result, this step is optional if the desired data source already exists in the registry, or if you plan on defining it from the wizard.<\/span><\/p>\n<p><b>Step 1 &#8211; Specify Search Parameters<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Here all aspects of a search are chosen. First, a source and target collection\/table are set up based on the data connection specified in the registry or created in the wizard. Then, you can specify the search and remediation criteria for your data using Search Matchers, which define the kinds of information to look for and how that information should be remediated. The result of this step is a .<\/span><i><span style=\"font-weight: 400;\">search<\/span><\/i><span style=\"font-weight: 400;\"> file.<\/span><\/p>\n<p><b>Step 2 &#8211; Conduct a Search<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A search can be run from a <\/span><i><span style=\"font-weight: 400;\">.search<\/span><\/i><span style=\"font-weight: 400;\"> file. The result is a <\/span><i><span style=\"font-weight: 400;\">.darkdata<\/span><\/i><span style=\"font-weight: 400;\"> file annotating any identified PII.<\/span><\/p>\n<p><b>Step 3 &#8211; Remediation (Masking)<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Remediation can be done from a <\/span><i><span style=\"font-weight: 400;\">.darkdata<\/span><\/i><span style=\"font-weight: 400;\"> file. 
Any identified PII will be remediated in the manner specified during search creation.<\/span><\/p>\n<h2>(Optional) Step &#8211; Register Your Data Sources<\/h2>\n<p><span style=\"font-weight: 400;\">As a prerequisite step, you will need to register the connections to your online data sources (and targets) in the URL Connection Registry, which is located in the <\/span><i><span style=\"font-weight: 400;\">Preferences &gt; IRI &gt; URL Connection Registry <\/span><\/i><span style=\"font-weight: 400;\">dialog in IRI Workbench. <\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13067 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-1.png\" alt=\"\" width=\"637\" height=\"334\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-1.png 752w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-1-300x157.png 300w\" sizes=\"(max-width: 637px) 100vw, 637px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">All URL Connections, including URI connection strings for MongoDB, Cassandra, and Elasticsearch can be saved. This allows for the URLs, authentication credentials, and any additional parameters to be saved and stored by IRI Workbench for future use.<span id='easy-footnote-3-13061' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#easy-footnote-bottom-3-13061' title='&lt;\/span&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;T&lt;\/span&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;he URL Connection Registry is used to configure and save URL-based data sources used in DarkShield search\/mask and CoSort\/SortCL (Voracity) ETL operations; e.g., HDFS, Kafka, S3 buckets, MongoDB, S\/FTP. 
This registry is similar, but not identical, to the &lt;\/span&gt;&lt;span style=&quot;font-weight: 400;&quot;&gt;Data Connection registry in IRI Workbench for relational database sources, where ODBC DSNs are bridged to corresponding JDBC connection profiles for the benefit of job wizards leveraging both connections.'><sup>3<\/sup><\/a><\/span><\/span><\/p>\n<h2>Step 1 &#8211; Specify Search Parameters (Create .Search File)<\/h2>\n<p><span style=\"font-weight: 400;\">In the <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\"> IDE for DarkShield, select New Database Discovery Job from the DarkShield menu.\u00a0<\/span><span style=\"font-weight: 400;\">Select a project folder and enter a name for the job:<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13069 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-3.png\" alt=\"\" width=\"426\" height=\"351\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-3.png 546w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-3-300x247.png 300w\" sizes=\"(max-width: 426px) 100vw, 426px\" \/><\/a><\/p>\n<h4><span style=\"font-weight: 400;\">Specifying a Source and Target<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">Any MongoDB, Cassandra, or Elasticsearch URLs that were previously created and saved in the registry can be accessed from the <\/span><i><span style=\"font-weight: 400;\">URI <\/span><\/i><span style=\"font-weight: 400;\">dropdown for both the Source and Target selectors. 
The name of the corresponding MongoDB collection, Cassandra table, or Elasticsearch index will also have to be entered:<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13070 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-4.png\" alt=\"\" width=\"420\" height=\"347\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-4.png 546w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-4-300x247.png 300w\" sizes=\"(max-width: 420px) 100vw, 420px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">A new URI can also be created by pressing the <\/span><i><span style=\"font-weight: 400;\">New <\/span><\/i><span style=\"font-weight: 400;\">button. This will open the URL Connection Details dialog. Enter a name for the connection, select the desired scheme, enter the host, and enter the database. If no port is present the default port for the scheme will be assumed.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A username and password can also be supplied if the database requires authorization. Any new URL connections will be saved in the URL Connection Registry, and can be re-used as a target.\u00a0<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13071 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-5.png\" alt=\"\" width=\"420\" height=\"362\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-5.png 685w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-5-300x259.png 300w\" sizes=\"(max-width: 420px) 100vw, 420px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">After a source is specified, you can continue to the next page to select or create a target URI. 
The scheme of the target URI will be limited to the selected source URI, so a MongoDB source can only be sent to another MongoDB target, and similarly for Cassandra or Elasticsearch.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-6.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13072 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-6.png\" alt=\"\" width=\"423\" height=\"349\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-6.png 546w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-6-300x247.png 300w\" sizes=\"(max-width: 423px) 100vw, 423px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">When a masking job is run, all rows in the source will be appended to the target, and any rows with matching keys will be overwritten. For Cassandra, ensure that the target table schema is compatible with data from the source table.<\/span><\/p>\n<h4><span style=\"font-weight: 400;\">Adding Search Matchers<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">After both a source and target are specified, you can go to the next page to add search matchers<\/span><span style=\"font-weight: 400;\">.<span id='easy-footnote-4-13061' class='easy-footnote-margin-adjust'><\/span><span class='easy-footnote'><a href='https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#easy-footnote-bottom-4-13061' title='A Search Matcher is an association between a &lt;i&gt;Data Class&lt;\/i&gt;, which is used to define the search method for finding and classifying PII, and a &lt;i&gt;Data Rule&lt;\/i&gt; which will be applied to any instance of the Data Class found in the collection or table. Additionally, Search Matchers allow you to define filters which can be used to reduce the scope of the search. 
This is particularly useful in Mongo collections, Cassandra tables, and Elasticsearch indexes because the key name can be indicative of the PII which is stored in the corresponding value.'><sup>4<\/sup><\/a><\/span> Select a library location containing any Patterns or Rules libraries which you wish to use and click <\/span><i><span style=\"font-weight: 400;\">Add <\/span><\/i><span style=\"font-weight: 400;\">to add a new Search Matcher.<\/span><\/p>\n<h4><span style=\"font-weight: 400;\">KeyNameMatcher<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">The first Search Matcher that we will create will be used to match the entire value corresponding to any \u201cname\u201d key located in arbitrarily nested JSON structures and apply a Format Preserving Encryption algorithm to mask it. We can achieve this by creating a JSON path filter \u201c$..name\u201d. More information about JSON paths can be found <\/span><a href=\"https:\/\/goessner.net\/articles\/JsonPath\/\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-7.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13073 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-7.png\" alt=\"\" width=\"418\" height=\"362\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-7.png 570w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-7-300x260.png 300w\" sizes=\"(max-width: 418px) 100vw, 418px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Since MongoDB collections, Cassandra tables, and Elasticsearch indexes are parsed by DarkShield as JSON documents, the same filter can be applied to all three in order to mask any value corresponding to any \u201cname\u201d key.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To match on the contents of the filtered data, we need to create a new 
<\/span><i><span style=\"font-weight: 400;\">Data Class<\/span><\/i><span style=\"font-weight: 400;\">. A Data Class represents PII and the associated matchers used to identify it. These matchers can include any combination of:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Regular Expression Patterns<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Set file dictionary lookups<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Named Entity Recognition Models<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Bounding Box Matchers (images only)<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Facial Recognition (images only)<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">You can define Data Classes within the wizard or by opening the <\/span><i><span style=\"font-weight: 400;\">Data Classes and Groups <\/span><\/i><span style=\"font-weight: 400;\">page in the <\/span><i><span style=\"font-weight: 400;\">IRI Preferences<\/span><\/i><span style=\"font-weight: 400;\">. The Data Classes defined within the preferences can be used in both FieldShield and DarkShield for other data sources, including structured and unstructured data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We can create an associated <\/span><i><span style=\"font-weight: 400;\">EVERYTHING <\/span><\/i><span style=\"font-weight: 400;\">Data Class for this matcher which will match on the entire contents of the value, since we are reasonably sure that all we will find in the values are names. 
You can use set file lookups containing a dictionary of names if you are unsure of the contents of your \u201cname\u201d keys or if you wish to mask only a subset of names.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-8.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13074 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-8.png\" alt=\"\" width=\"419\" height=\"313\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-8.png 603w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-8-300x224.png 300w\" sizes=\"(max-width: 419px) 100vw, 419px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">For the <\/span><i><span style=\"font-weight: 400;\">Rule Name<\/span><\/i><span style=\"font-weight: 400;\"> field of the KeyNameMatcher, we can select an existing Data Rule from the library location we have selected, or create a new rule that uses Format Preserving Encryption (<\/span><a href=\"https:\/\/www.iri.com\/solutions\/data-masking\/static-data-masking\/encrypt\/format-preserving-encryption\"><span style=\"font-weight: 400;\">FPE<\/span><\/a><span style=\"font-weight: 400;\">).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To create an FPE rule, click <\/span><i><span style=\"font-weight: 400;\">Create <\/span><\/i><span style=\"font-weight: 400;\">next to the <\/span><i><span style=\"font-weight: 400;\">Rule Name <\/span><\/i><span style=\"font-weight: 400;\">field, and select <\/span><i><span style=\"font-weight: 400;\">Encryption or Decryption Functions<\/span><\/i><span style=\"font-weight: 400;\"> from the Data Rule Wizard that appears:<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-9.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13075 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-9.png\" alt=\"\" width=\"420\" height=\"442\" 
srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-9.png 525w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-9-285x300.png 285w\" sizes=\"(max-width: 420px) 100vw, 420px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Specify an appropriate passphrase to serve as your encryption\/decryption key, which can be an explicit string, environment variable, or the name of a secured file containing that string.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-10.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13076 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-10.png\" alt=\"\" width=\"461\" height=\"346\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-10.png 773w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-10-300x225.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-10-768x577.png 768w\" sizes=\"(max-width: 461px) 100vw, 461px\" \/><\/a><\/p>\n<h4><span style=\"font-weight: 400;\">EmailsMatcher<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">After finishing the previous dialog and creating our new KeyNameMatcher, we can add another Search Matcher for email addresses. 
Simply click <\/span><i><span style=\"font-weight: 400;\">Add <\/span><\/i><span style=\"font-weight: 400;\">to create another Search Matcher to add to the list.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The <\/span><i><span style=\"font-weight: 400;\">IRI Workbench <\/span><\/i><span style=\"font-weight: 400;\">comes preloaded with an <\/span><i><span style=\"font-weight: 400;\">EMAIL <\/span><\/i><span style=\"font-weight: 400;\">Data Class which can be selected by clicking on <\/span><i><span style=\"font-weight: 400;\">Browse <\/span><\/i><span style=\"font-weight: 400;\">next to the <\/span><i><span style=\"font-weight: 400;\">Data Class Name<\/span><\/i><span style=\"font-weight: 400;\"> field and selecting <\/span><i><span style=\"font-weight: 400;\">EMAIL <\/span><\/i><span style=\"font-weight: 400;\">from the dropdown menu.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-11.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13077 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-11.png\" alt=\"\" width=\"434\" height=\"372\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-11.png 525w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-11-300x257.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-11-350x300.png 350w\" sizes=\"(max-width: 434px) 100vw, 434px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">For the Data Rule, you can select the FPE rule you have created for the previous Search Matcher by clicking on <\/span><i><span style=\"font-weight: 400;\">Browse <\/span><\/i><span style=\"font-weight: 400;\">next to the <\/span><i><span style=\"font-weight: 400;\">Rule Name <\/span><\/i><span style=\"font-weight: 400;\">field, or create a new one with one of the multiple masking functions available. 
I created a simple Data Redaction function which replaces the entire email with asterisks.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Your Search Matcher can now be added to the list by clicking on <\/span><i><span style=\"font-weight: 400;\">OK.<\/span><\/i><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-12.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13078 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-12.png\" alt=\"\" width=\"426\" height=\"401\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-12.png 525w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-12-300x282.png 300w\" sizes=\"(max-width: 426px) 100vw, 426px\" \/><\/a><\/p>\n<h4><span style=\"font-weight: 400;\">NamesMatcher<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">Our last Search Matcher will be used to find names within free-flowing text. For this, we will use <\/span><i><span style=\"font-weight: 400;\">Named Entity Recognition (NER) <\/span><\/i><span style=\"font-weight: 400;\">to find names using the sentence\u2019s context. 
To begin, we need to click <\/span><i><span style=\"font-weight: 400;\">Add <\/span><\/i><span style=\"font-weight: 400;\">to create a new Search Matcher, and create a new Data Class called <\/span><i><span style=\"font-weight: 400;\">NAMES_NER:<\/span><\/i><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13079 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-13.png\" alt=\"\" width=\"449\" height=\"335\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-13.png 603w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-13-300x224.png 300w\" sizes=\"(max-width: 449px) 100vw, 449px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">To create a <\/span><i><span style=\"font-weight: 400;\">NAMES_NER <\/span><\/i><span style=\"font-weight: 400;\">Data Class, we first need to download the Person Name Finder model, <\/span><i><span style=\"font-weight: 400;\">en-ner-person.bin<\/span><\/i><span style=\"font-weight: 400;\">, from the OpenNLP <\/span><a href=\"http:\/\/opennlp.sourceforge.net\/models-1.5\/\"><span style=\"font-weight: 400;\">SourceForge repository<\/span><\/a><span style=\"font-weight: 400;\">. Then, click <\/span><i><span style=\"font-weight: 400;\">Add <\/span><\/i><span style=\"font-weight: 400;\">to add a new matcher and select <\/span><i><span style=\"font-weight: 400;\">NER Model<\/span><\/i><span style=\"font-weight: 400;\"> from the dropdown. 
Click <\/span><i><span style=\"font-weight: 400;\">Browse <\/span><\/i><span style=\"font-weight: 400;\">and navigate to the location of the downloaded model; for example:<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-14.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13080 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-14.png\" alt=\"\" width=\"465\" height=\"342\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-14.png 612w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-14-300x221.png 300w\" sizes=\"(max-width: 465px) 100vw, 465px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">After you have created the new Data Class, click <\/span><i><span style=\"font-weight: 400;\">OK <\/span><\/i><span style=\"font-weight: 400;\">and select the FPE Data Rule you defined previously to finish creating the Search Matcher:<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-15.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13081 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-15.png\" alt=\"\" width=\"416\" height=\"392\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-15.png 525w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-15-300x282.png 300w\" sizes=\"(max-width: 416px) 100vw, 416px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Note that our <\/span><i><span style=\"font-weight: 400;\">NamesMatcher <\/span><\/i><span style=\"font-weight: 400;\">and <\/span><i><span style=\"font-weight: 400;\">KeyNameMatcher <\/span><\/i><span style=\"font-weight: 400;\">may have overlapping matches. If this happens, DarkShield selects the longest available match and removes any other overlapping matches. 
That way, you do not have to worry about DarkShield applying a masking function to already masked values.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once you have added all the desired matchers, click Finish to generate a <\/span><i><span style=\"font-weight: 400;\">.search<\/span><\/i><span style=\"font-weight: 400;\"> file.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-16.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13082 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-16.png\" alt=\"\" width=\"440\" height=\"363\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-16.png 546w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-16-300x247.png 300w\" sizes=\"(max-width: 440px) 100vw, 440px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">The generated <\/span><i><span style=\"font-weight: 400;\">.search<\/span><\/i><span style=\"font-weight: 400;\"> file can be inspected to show the details about a search. 
This includes the source and target URIs, and information about all the matchers.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-17.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13083 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-17.png\" alt=\"\" width=\"611\" height=\"472\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-17.png 789w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-17-300x232.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-17-768x594.png 768w\" sizes=\"(max-width: 611px) 100vw, 611px\" \/><\/a><\/p>\n<h2><span style=\"font-weight: 400;\">Step 2 &#8211; Conduct a Search (Create a <\/span><i><span style=\"font-weight: 400;\">.Darkdata<\/span><\/i><span style=\"font-weight: 400;\"> File)<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Completing the <\/span><i><span style=\"font-weight: 400;\">Dark Data Discovery Job <\/span><\/i><span style=\"font-weight: 400;\">wizard generates a new .<\/span><i><span style=\"font-weight: 400;\">search <\/span><\/i><span style=\"font-weight: 400;\">configuration file. 
That file contains the options we selected, including the source and target of our data, and the Search Matchers that will be used to tag PII for discovery, delivery, deletion, and\/or de-identification.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To begin the search, right click on the<\/span><i><span style=\"font-weight: 400;\"> .search<\/span><\/i><span style=\"font-weight: 400;\"> file, select <\/span><i><span style=\"font-weight: 400;\">Run As,<\/span><\/i><span style=\"font-weight: 400;\"> and choose either<\/span><i><span style=\"font-weight: 400;\"> IRI Search Job<\/span><\/i><span style=\"font-weight: 400;\"> or <\/span><i><span style=\"font-weight: 400;\">IRI Search and Remediate Job<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-18.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13084 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-18-1024x768.png\" alt=\"\" width=\"608\" height=\"456\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-18-1024x768.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-18-300x225.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-18-768x576.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-18.png 1105w\" sizes=\"(max-width: 608px) 100vw, 608px\" \/><\/a><\/p>\n<p><i><span style=\"font-weight: 400;\">Search<\/span><\/i><span style=\"font-weight: 400;\"> will only conduct a search, while <\/span><i><span style=\"font-weight: 400;\">Search and Remediate<\/span><\/i><span style=\"font-weight: 400;\"> will also attempt to mask (or delete) any identified data. 
Both will generate a <\/span><i><span style=\"font-weight: 400;\">.darkdata<\/span><\/i><span style=\"font-weight: 400;\"> file identifying any data of interest.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The source I used was populated with randomly generated values, so there is no harm in sharing the generated <\/span><i><span style=\"font-weight: 400;\">.darkdata<\/span><\/i><span style=\"font-weight: 400;\"> file here. However, when handling genuinely sensitive information, users should ensure that the <\/span><i><span style=\"font-weight: 400;\">.darkdata<\/span><\/i><span style=\"font-weight: 400;\"> file is not exposed and is safely archived or deleted once remediation is complete to prevent PII leakage. IRI will add a quarantine option for storing the <\/span><i><span style=\"font-weight: 400;\">.darkdata <\/span><\/i><span style=\"font-weight: 400;\">file and corresponding search artifacts in a safe location; contact <\/span><a href=\"mailto:darkshield@iri.com\"><span style=\"font-weight: 400;\">darkshield@iri.com<\/span><\/a><span style=\"font-weight: 400;\"> for details on this planned feature.<\/span><\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-19.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13085 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-19.png\" alt=\"\" width=\"643\" height=\"568\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-19.png 789w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-19-300x265.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-19-768x678.png 768w\" sizes=\"(max-width: 643px) 100vw, 643px\" \/><\/a><\/p>\n<h2><span style=\"font-weight: 400;\">Step 3 &#8211; Remediation (Masking)<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Again, data masking or deletion can be performed during Search operations via the <\/span><i><span 
style=\"font-weight: 400;\">Search and Remediate<\/span><\/i><span style=\"font-weight: 400;\"> option in the Dark Data Discovery wizard. However, if you would rather review the identified information first and remediate it later, run the masking job from the<\/span><i><span style=\"font-weight: 400;\"> .darkdata<\/span><\/i><span style=\"font-weight: 400;\"> file produced in the Search (Step 2) this way:\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Right-click on the <\/span><i><span style=\"font-weight: 400;\">.darkdata<\/span><\/i><span style=\"font-weight: 400;\"> file, mouse over <\/span><i><span style=\"font-weight: 400;\">Run As<\/span><\/i><span style=\"font-weight: 400;\">, and click <\/span><i><span style=\"font-weight: 400;\">IRI Remediate Job<\/span><\/i><span style=\"font-weight: 400;\">. Once the job runs, the remediated data should appear in the target database.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Here is a before-and-after view of a small MongoDB collection, queried from the Workbench command prompt against the local MongoDB server:<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13097 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20-1024x405.png\" alt=\"\" width=\"698\" height=\"276\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20-1024x405.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20-300x119.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20-768x304.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png 1194w\" sizes=\"(max-width: 698px) 100vw, 698px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h4>CONCLUSION<\/h4>\n<p><span style=\"font-weight: 400;\">In this article we demonstrated a new IRI 
capability to access unstructured data in MongoDB databases, Cassandra keyspaces, and Elasticsearch clusters using several Search Matchers in <\/span><a href=\"https:\/\/www.iri.com\/products\/darkshield\"><span style=\"font-weight: 400;\">IRI DarkShield<\/span><\/a><span style=\"font-weight: 400;\">. You can check the generated <\/span><i><span style=\"font-weight: 400;\">.darkdata<\/span><\/i><span style=\"font-weight: 400;\"> model to see which search results were found and remediated, and check your database to see the updated tables\/collections.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Editor&#8217;s Note: As of 2024 and DarkShield V5, the search and masking methods for these NoSQL DBs has been upgraded; please refer to the discussion and demonstration in this updated article instead! This article demonstrates the use of IRI DarkShield to identify and remediate (mask) personally identifiable information (PII) and other sensitive data in MongoDB,<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\" title=\"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method\">Read 
More<\/a><\/div>\n","protected":false},"author":121,"featured_media":13097,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[108,8,91,3,2255],"tags":[1430,1429,1428,510,1433,1386,20,1304,14,1469,1470,1388,850,1434,533,1273,1427,1431,1432],"class_list":["post-13061","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data-2","category-data-protection","category-iri-workbench","category-vldb-operations","category-archived-articles","tag-cassandar-data-security","tag-cassandra-data-masking","tag-cassandra-database-masking","tag-cassandra-datastax","tag-dark-data-masking","tag-darkshield","tag-data-anonymization","tag-data-class","tag-data-masking","tag-elasticsearch","tag-elasticsearch-data-masking","tag-iri-darkshield","tag-iri-workbench","tag-masking-pii-in-mongodb","tag-mongodb","tag-mongodb-data-masking","tag-mongodb-database-masking","tag-mongodb-security","tag-unstructured-data-masking"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method - IRI<\/title>\n<meta name=\"description\" content=\"This article demonstrates the use of IRI DarkShield to identify and remediate (mask) personally identifying information (PII) and other sensitive data\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" 
\/>\n<meta property=\"og:title\" content=\"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method\" \/>\n<meta property=\"og:description\" content=\"This article demonstrates the use of IRI DarkShield to identify and remediate (mask) personally identifying information (PII) and other sensitive data\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2019-08-08T21:51:48+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-23T22:08:37+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1194\" \/>\n\t<meta property=\"og:image:height\" content=\"472\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"William Ulrich\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"William Ulrich\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"14 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\"},\"author\":{\"name\":\"William Ulrich\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/6f913f4d3a828eac88be5666ca709b1f\"},\"headline\":\"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method\",\"datePublished\":\"2019-08-08T21:51:48+00:00\",\"dateModified\":\"2026-02-23T22:08:37+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\"},\"wordCount\":2130,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png\",\"keywords\":[\"cassandar data security\",\"cassandra data masking\",\"cassandra database masking\",\"Cassandra DataStax\",\"dark data masking\",\"DarkShield\",\"data anonymization\",\"data class\",\"data masking\",\"Elasticsearch\",\"Elasticsearch data masking\",\"IRI DarkShield\",\"IRI Workbench\",\"masking pii in mongodb\",\"MongoDB\",\"MongoDB data masking\",\"mongodb database masking\",\"mongodb security\",\"unstructured data masking\"],\"articleSection\":[\"Big Data\",\"Data Masking\/Protection\",\"IRI Workbench\",\"VLDB\",\"Archived 
Articles\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\",\"url\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\",\"name\":\"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png\",\"datePublished\":\"2019-08-08T21:51:48+00:00\",\"dateModified\":\"2026-02-23T22:08:37+00:00\",\"description\":\"This article demonstrates the use of IRI DarkShield to identify and remediate (mask) personally identifying information (PII) and other sensitive 
data\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png\",\"width\":1194,\"height\":472},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management 
Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/6f913f4d3a828eac88be5666ca709b1f\",\"name\":\"William Ulrich\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e6705ed48fd0f8e0817ab1d89cefe9b0?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e6705ed48fd0f8e0817ab1d89cefe9b0?s=96&d=blank&r=g\",\"caption\":\"William Ulrich\"},\"url\":\"https:\/\/www.iri.com\/blog\/author\/williamu\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method - IRI","description":"This article demonstrates the use of IRI DarkShield to identify and remediate (mask) personally identifying information (PII) and other sensitive data","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/","og_locale":"en_US","og_type":"article","og_title":"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method","og_description":"This article demonstrates the use of IRI DarkShield to identify and remediate (mask) personally identifying information (PII) and other sensitive data","og_url":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/","og_site_name":"IRI","article_published_time":"2019-08-08T21:51:48+00:00","article_modified_time":"2026-02-23T22:08:37+00:00","og_image":[{"width":1194,"height":472,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png","type":"image\/png"}],"author":"William Ulrich","twitter_card":"summary_large_image","twitter_misc":{"Written by":"William Ulrich","Est. 
reading time":"14 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/"},"author":{"name":"William Ulrich","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/6f913f4d3a828eac88be5666ca709b1f"},"headline":"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method","datePublished":"2019-08-08T21:51:48+00:00","dateModified":"2026-02-23T22:08:37+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/"},"wordCount":2130,"commentCount":1,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png","keywords":["cassandar data security","cassandra data masking","cassandra database masking","Cassandra DataStax","dark data masking","DarkShield","data anonymization","data class","data masking","Elasticsearch","Elasticsearch data masking","IRI DarkShield","IRI Workbench","masking pii in mongodb","MongoDB","MongoDB data masking","mongodb database masking","mongodb security","unstructured data masking"],"articleSection":["Big Data","Data Masking\/Protection","IRI Workbench","VLDB","Archived Articles"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/","url":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/","name":"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method - 
IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png","datePublished":"2019-08-08T21:51:48+00:00","dateModified":"2026-02-23T22:08:37+00:00","description":"This article demonstrates the use of IRI DarkShield to identify and remediate (mask) personally identifying information (PII) and other sensitive data","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png","width":1194,"height":472},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/data-protection\/mongodb-cassandra-darkshield\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Masking PII in MongoDB, Cassandra, and Elasticsearch with DarkShield: 4th IRI Method"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/6f913f4d3a828eac88be5666ca709b1f","name":"William Ulrich","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e6705ed48fd0f8e0817ab1d89cefe9b0?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e6705ed48fd0f8e0817ab1d89cefe9b0?s=96&d=blank&r=g","caption":"William 
Ulrich"},"url":"https:\/\/www.iri.com\/blog\/author\/williamu\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/08\/mongo-darkshield-20.png","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13061"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/121"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=13061"}],"version-history":[{"count":17,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13061\/revisions"}],"predecessor-version":[{"id":17356,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/13061\/revisions\/17356"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/13097"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=13061"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=13061"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=13061"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}