Regular expression - regex exact value and replace
Introduction:
This tutorial shows how to remove personal information from JSON log output with the sed command, so that the logs can be shared with a third-party vendor for troubleshooting.
In this tutorial we will cover the points below:
1 - How to remove customers' personal information from a log file
2 - How to mask PII data in JSON log output
3 - How to mask personal information in a file before sharing it with a vendor
4 - How to match a JSON value with a regex and replace it
5 - How to grep a value and remove it from a file
6 - How to use regular expressions in Linux
Why do we need to remove personal data?
Suppose we are working on an application and hit an error that we cannot resolve ourselves, so we need to involve the application vendor.
The vendor needs the logs to understand the issue. Sharing the logs directly is a big challenge, because many organizations do not allow customer information to be shared with a third party or vendor.
So we mask the personal data in the logs before sharing them with the vendor. For one or two entries we can do this manually, but for many entries or a big log file that is too time-consuming.
OS / Tools / Command
1 - Linux
2 - Bash
3 - grep
4 - sed
5 - Log file (JSON logs)
Step 1:
Copy the log file that contains the personal information to be removed. In this tutorial the log file is at the path below:
log file path : /var/log/event.log
cp -rv /var/log/event.log /tmp/
Review the log file and note down which values you want to remove:
cat /var/log/event.log
The log contains some PII fields that cannot be shared outside the organization, such as:
- accountNumber
- operator
- custName
- custID
- cardHolder
- card
- Phone
- Address
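To note down the fields quickly, the key names in the file can be listed with grep. This is a minimal sketch, assuming GNU grep and flat JSON with string values. The sample line is hypothetical (only the accountNumber and Address values appear in this tutorial), so a real log will look different.

```shell
# Hypothetical sample log line, written to a temporary file for the demo.
sample=$(mktemp)
printf '%s\n' '{"event":"payment","accountNumber":"350211212093","custName":"Test User","Address":"235/2 Street ll PBL"}' > "$sample"

# Print each "key": once, so the PII fields can be picked out by eye.
keys=$(grep -o '"[A-Za-z]*":' "$sample" | sort -u)
printf '%s\n' "$keys"
```

On the real file, replace "$sample" with /tmp/event.log.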
We will pick one field, accountNumber, and print it with the grep command. This field contains only digits.
"accountNumber":"350211212093"
First we grep only the word accountNumber in the log. With -o, grep prints only the matching text instead of the whole line.
grep -o "accountNumber" /tmp/event.log
Now we extend the pattern to include the ":" that follows the field name
grep -o "accountNumber\":\"" /tmp/event.log
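In the above example we put a backslash before each double quote, because the field name is followed by a double quote, then a colon ( : ), then another double quote.
Note: The pattern itself is wrapped in double quotes, so every double quote inside it must be escaped with a backslash; otherwise the shell would end the pattern there.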
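The same search can also be written with single quotes, in which case the shell passes the double quotes through unchanged and no backslashes are needed. A small sketch comparing both spellings; the sample line and the custID value are made up for illustration.

```shell
# Hypothetical sample log line.
sample=$(mktemp)
printf '%s\n' '{"accountNumber":"350211212093","custID":"1001"}' > "$sample"

# Double-quoted pattern: each inner double quote needs a backslash.
dq=$(grep -o "accountNumber\":\"" "$sample")
# Single-quoted pattern: the double quotes need no escaping.
sq=$(grep -o 'accountNumber":"' "$sample")

printf '%s\n%s\n' "$dq" "$sq"
```

Both commands print the same matched text, so either style can be used in the steps that follow.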
In the next step we use a regular expression to match a digit: the [0-9] bracket expression in the grep command below.
grep -o "accountNumber\":\"[0-9]" /tmp/event.log
The above example prints only one digit. We add a plus ( \+ ) to match all the digits between the double quotes.
grep -o "accountNumber\":\"[0-9]\+" /tmp/event.log
In the final step we also match the closing double quote
grep -o "accountNumber\":\"[0-9]\+\"" /tmp/event.log
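The build-up above can be replayed on a small file to see what each pattern prints. This is a sketch on a hypothetical sample line, assuming GNU grep: \+ is a GNU extension to basic regular expressions, and the last command shows the equivalent extended-regex spelling with -E.

```shell
# Hypothetical sample log line.
sample=$(mktemp)
printf '%s\n' '{"accountNumber":"350211212093","custID":"1001"}' > "$sample"

# One digit only.
one_digit=$(grep -o 'accountNumber":"[0-9]' "$sample")
# All digits, without the closing quote.
all_digits=$(grep -o 'accountNumber":"[0-9]\+' "$sample")
# All digits plus the closing quote.
full=$(grep -o 'accountNumber":"[0-9]\+"' "$sample")
# Same match written as an extended regular expression.
full_ere=$(grep -oE 'accountNumber":"[0-9]+"' "$sample")

printf '%s\n' "$one_digit" "$all_digits" "$full" "$full_ere"
```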
Now that grep prints the exact value from the log file, we use the sed command to replace or remove the account number.
sed 's/accountNumber\":\"[0-9]\+/accountNumber\":\"REMOVED/g' /tmp/event.log
The command above is a dry run: without -i, sed only prints the result and leaves the file unchanged. To replace the value in the file itself, use the command below
sed -i 's/accountNumber\":\"[0-9]\+/accountNumber\":\"REMOVED/g' /tmp/event.log
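A cautious way to run the in-place edit is to preview it first and keep a backup of the original. The sketch below works on a hypothetical sample file rather than /tmp/event.log; the .bak suffix is an arbitrary choice, and the commands assume GNU sed.

```shell
# Hypothetical sample log file.
work=$(mktemp)
printf '%s\n' '{"accountNumber":"350211212093","custID":"1001"}' > "$work"

# Dry run: the result goes to standard output, the file is untouched.
preview=$(sed 's/accountNumber":"[0-9]\+/accountNumber":"REMOVED/g' "$work")
printf '%s\n' "$preview"

# In-place edit that keeps the original content as "$work.bak".
sed -i.bak 's/accountNumber":"[0-9]\+/accountNumber":"REMOVED/g' "$work"

# Show what changed (diff exits non-zero when the files differ).
diff "$work.bak" "$work" || true
```

The backup file still contains the personal data, so it should be deleted or kept out of whatever is sent to the vendor.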
Value B - alphanumeric and special characters
We will choose an address from the log file. The address below contains letters, digits and special characters
"Address":"235/2 Street ll PBL"
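We will use the following items inside square brackets (a bracket expression):
a-z - lowercase letters
A-Z - uppercase letters
0-9 - digits
space / _ , @ - - special characters
The hyphen is placed last so that it is taken literally; written as _-, it would be read as a range and can fail with an "Invalid range end" error. \s is left out because inside square brackets it is not a whitespace shorthand, so a literal space is used instead.
[a-zA-Z0-9 /_,@-]\+
grep -o "Address\":\"[a-zA-Z0-9 /_,@-]\+\"" /tmp/event.log
We got the address value, so now we use the sed command again. Because the bracket expression contains a slash, the sed commands below use | as the delimiter instead of /
sed -e 's|Address\":\"[a-zA-Z0-9 /_,@-]\+|Address\":\"REMOVED|g' /tmp/event.log
Replace address value :
sed -i 's|Address\":\"[a-zA-Z0-9 /_,@-]\+|Address\":\"REMOVED|g' /tmp/event.log
Note: My suggestion is to use the [a-zA-Z0-9 /_,@-]\+ regular expression for all of the fields above; it works whether the value contains letters, digits, special characters or a mix of them.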
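As an alternative that is not part of the original steps, the value can be matched as "everything up to the next double quote" with [^"]*, which avoids listing the allowed characters at all. The sketch below assumes values never contain an escaped quote (\"), uses | as the sed delimiter, and runs on a hypothetical sample line; the Phone value is a made-up placeholder.

```shell
# Hypothetical sample log line.
work=$(mktemp)
printf '%s\n' '{"Address":"235/2 Street ll PBL","Phone":"+00 (000) 000-0000"}' > "$work"

# Replace whatever sits between the quotes of each field with REMOVED.
masked=$(sed 's|"Address":"[^"]*"|"Address":"REMOVED"|g; s|"Phone":"[^"]*"|"Phone":"REMOVED"|g' "$work")
printf '%s\n' "$masked"
```

The trade-off is that this pattern masks any string value, including an empty one, so it is less selective than an explicit character list.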
Script :
1 - Create a script file with any name you like and the content below
vi /tmp/mask.sh
#!/bin/bash
# First create a working copy of the log in the /tmp directory, then mask each field with sed inside a for loop.
cp -rv /var/log/event.log /tmp/event.log
for i in accountNumber custID Address operator custName cardHolder card Phone
do
sed -i "s|$i\":\"[a-zA-Z0-9/ _,@*-]\+|$i\":\"REMOVED|g" /tmp/event.log
done
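Before the file goes to the vendor, it is worth checking that no listed field still holds a real value. The sketch below masks a hypothetical sample file with a character list like the one in this tutorial, then counts the fields whose value is anything other than REMOVED. The sample values are made up, and the Phone value deliberately starts with a character outside the list to show what the check catches. GNU sed and grep are assumed.

```shell
# Hypothetical sample log file.
work=$(mktemp)
printf '%s\n' '{"accountNumber":"350211212093","custName":"Test User","Phone":"+00 000 0000"}' > "$work"

fields="accountNumber custName Phone"

# Mask each field using an explicit list of allowed characters.
for f in $fields; do
    sed -i "s|$f\":\"[a-zA-Z0-9/ _,@*-]\+|$f\":\"REMOVED|g" "$work"
done

# Count the fields whose value is still something other than REMOVED.
leftover=0
for f in $fields; do
    if grep -o "\"$f\":\"[^\"]*\"" "$work" | grep -qv ':"REMOVED"$'; then
        echo "unmasked value found for $f" >&2
        leftover=$((leftover + 1))
    fi
done
echo "fields still unmasked: $leftover"
```

Here the check reports one unmasked field, Phone, because + is not in the character list. Either the list needs extending or a broader pattern is needed for that field.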
Method 2 is covered here:
https://www.linuxtopic.com/2021/05/02-regular-expression-regex-replace-all.html