Use this process if you would like full control of your backups; otherwise, I advise you to use the NRT Wrapper Method for an automated approach: Centralized Backup & Restore of NetWitness Version 11.2+ (A Wrapper Script for NRT).
Changes are inevitable, and no one knows when a restore is going to be needed. Today, backup and restore processes are standard, required, and part of nearly all basic deployment strategies. With that in mind, I was tasked with backing up an environment of 32 hosts, and running NRT manually was not an option, so I had to try to automate it.
Scenario -
Regular backups from all the NetWitness hosts throughout the environment in the event of a problem, an emergency, or an upgrade.
Problem -
NRT is a manual process out of the box and requires a significant amount of time (you have to run the tool on each host and then relocate the backup files).
Resolution -
Through some trial and error, I built a script to automate the process that can be controlled with a cron job. The issue was not NRT per se; it was pulling the backup files to a central location. I initially thought Chef might be able to pull these items back, but pulling was not something Chef did well, and none of the Chef solutions I experimented with worked the way I wanted. So I went back to old-school private/public keys (because scp is interactive otherwise) to move the backups to a secondary ESA used for storage only. The next issue I came across with NRT was that it is device specific (e.g., --category Decoder), which was problematic for the automation.
Solution -
- Get the keys situated (interactive responses make this step necessary)
- On the backup host (where all the backups will reside)
- cd ~/.ssh
- ssh-keygen -t rsa -b 2048 -N "" -f ~/.ssh/nwbackup_key
- Copy the public key file to the SA Server
- On the SA Server
- Put the public nwbackup_key.pub in /tmp
- salt '*' cmd.run 'mkdir -p ~/.ssh' (creates the .ssh directory on all NetWitness hosts)
- salt '*' cmd.run 'chmod 700 ~/.ssh' (sets the permissions of the .ssh directory on all NetWitness hosts)
- salt-cp '*' /tmp/nwbackup_key.pub /tmp/nwbackup_key.pub (copies the public key to /tmp on all NetWitness hosts)
- salt '*' cmd.run 'chmod 755 /tmp/nwbackup_key.pub'
- salt '*' cmd.run 'cat /tmp/nwbackup_key.pub >> ~/.ssh/authorized_keys'
- salt '*' cmd.run 'rm -f /tmp/nwbackup_key.pub'
- Check to make sure the public key has been saved in the authorized_keys file on the hosts
- On the backup server
- Test the key by running ssh -i ~/.ssh/nwbackup_key root@<host> from the backup host to one of the servers being backed up; a quick check loop for all hosts is sketched below.
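If you have more than a couple of hosts, it is worth checking all of them at once. Here is a minimal sketch, assuming the hosts to be backed up are listed one per line in /root/nwbackup_hosts.txt (that file name is my own placeholder, not part of the steps above):
#!/bin/bash
# Key connectivity check, run on the backup host
KEY=~/.ssh/nwbackup_key
while read -r host; do
    # -n stops ssh from consuming the rest of the host list; BatchMode fails fast instead of prompting
    if ssh -n -i "$KEY" -o BatchMode=yes -o ConnectTimeout=5 "root@$host" hostname >/dev/null 2>&1; then
        echo "OK:   $host"
    else
        echo "FAIL: $host"
    fi
done < /root/nwbackup_hosts.txt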
Command for SA Server (Backup)
nw-recovery-tool --export --dump-dir /var/netwitness/nw-recovery-tool_backup --category AdminServer
Commands for the other host types (Backup)
nw-recovery-tool --export --dump-dir /var/netwitness/nw-recovery-tool_backup --category NetworkHybrid
nw-recovery-tool --export --dump-dir /var/netwitness/nw-recovery-tool_backup --category Decoder
nw-recovery-tool --export --dump-dir /var/netwitness/nw-recovery-tool_backup --category Concentrator
nw-recovery-tool --export --dump-dir /var/netwitness/nw-recovery-tool_backup --category Broker
- Other host types: Archiver, Broker, Concentrator, Decoder, EndpointHybrid, EndpointLogHybrid, ESAPrimary, ESASecondary, LogCollector, LogDecoder, LogHybrid, Malware, NetworkHybrid, UEBA
- Create a script for each host type in your environment (a minimal per-host sketch follows this list)
- Create a script for the backup server (in my case it is a Secondary ESA)
- On the backup host (where all the backups will reside)
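Here is a minimal sketch of what one of the per-host scripts could look like (Decoder shown; the dump directory matches the commands above, but the script name, the cleanup step, and keeping only the latest run are my own assumptions):
#!/bin/bash
# /root/nw_backup.sh - hypothetical per-host backup script; edit CATEGORY for each host type
CATEGORY=Decoder
DUMP_DIR=/var/netwitness/nw-recovery-tool_backup
mkdir -p "$DUMP_DIR"
# clear the previous run so only the latest dump gets pulled back
rm -rf "${DUMP_DIR:?}"/*
nw-recovery-tool --export --dump-dir "$DUMP_DIR" --category "$CATEGORY"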
Summary:
- Generated public/private key for the backup host
- Distributed the public key via salt on the SA Server
- Validated non-interactive ssh connectivity to the hosts from the backup server
- Created a backup script for each host type (broker, decoder, etc.)
- Distributed the host scripts via salt on the SA Server
- Created a script on the backup server to run the host scripts and then scp the backup files back to the backup server (a sketch follows this list).
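As a rough illustration, the backup-server script could look something like this sketch; the host list file, the remote script path, and the local directory layout are assumptions for the example, not the exact files from my environment:
#!/bin/bash
# /root/nw_central_backup.sh - hypothetical central backup script, run from cron on the backup server
KEY=~/.ssh/nwbackup_key
DEST=/var/netwitness/central_backups
STAMP=$(date +%Y%m%d)
while read -r host; do
    # run the per-host NRT script that was distributed via salt
    ssh -n -i "$KEY" "root@$host" /root/nw_backup.sh
    # pull the resulting dump directory back and file it by date and host
    mkdir -p "$DEST/$STAMP/$host"
    scp -i "$KEY" -r "root@$host:/var/netwitness/nw-recovery-tool_backup" "$DEST/$STAMP/$host/"
done < /root/nwbackup_hosts.txt
A cron entry such as 0 2 * * 0 /root/nw_central_backup.sh >> /var/log/nw_central_backup.log 2>&1 (weekly, Sunday at 02:00) then takes care of the schedule; pick a frequency and retention that fit your environment.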
In the end, my solution consisted of 8 scripts (the same script with edits to the device type by service and category). The goal was to put all the backup files in one place so I could copy them somewhere safe. Please keep in mind that this is just one of many ways to centralize backups; it is the solution I chose based on the environment I was working in.
Let me know your thoughts.
Tom J
This is great; it has some really good ideas. I've never tried to use Chef to distribute keys to the other hosts.
Not necessarily; you can get that information from the Admin server's Mongo database using this:
mongo admin -u deploy_admin -p <deployPW> --eval "db=db.getSiblingDB('orchestration-server');db.getCollection('host').find({}, { _id: 0, _class: 0, version: 0, thirdParty: 0, meta: 0}).forEach(function(f){print(tojson(f, '', true));})"
where <deployPW> is your deployment password. This will generate output like this:
{ "hostname" : "192.168.1.1", "displayName" : "NW11-2-SA", "installedServices" : [ "AdminServer" ] }
{ "hostname" : "192.168.1.2", "displayName" : "NW11-2-ESA", "installedServices" : [ "ESAPrimary" ] }
{ "hostname" : "192.168.1.3", "displayName" : "NW11-2-LDEC", "installedServices" : [ "LogDecoder" ] }
{ "hostname" : "192.168.1.4", "displayName" : "NW11-2-LCON", "installedServices" : [ "Concentrator" ] }
From this point, you can use 'cut' to get the last field (the service) for each host, and use it dynamically when making the commands for the hosts.
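As a quick sketch of that idea, assuming the query output above was saved to /tmp/nw_hosts.txt (the file name is mine) and each host reports a single installed service, the hostname and service can be split out with cut and turned into the backup commands; hosts with more than one installed service would need extra handling:
# field 4 is the hostname and field 12 is the installed service when splitting on double quotes
while read -r line; do
    host=$(echo "$line" | cut -d'"' -f4)
    service=$(echo "$line" | cut -d'"' -f12)
    echo "ssh root@$host nw-recovery-tool --export --dump-dir /var/netwitness/nw-recovery-tool_backup --category $service"
done < /tmp/nw_hosts.txt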
One more idea worth considering:
A CEF output can be added in case a backup fails, so that you can create alerts with mail notifications. Just send a log using the proper format to your designated Log Decoder:
logger -n <LOGDECODER_IP> -P 514 -t "$(hostname)" "CEF:0|RSA|NetWitness Audit|11.2.0.0|BACKUP_FAILURE|Failed to backup component, check the logs|6|rt=$(date "+%b %d %Y %H:%M:%S") suser=admin sourceServiceName=SA_SERVER deviceProcessName=SA_SERVER outcome=Failure"
Of course you should customize this line to represent your deployment (version number, service name, log message, etc.). Also, firewall rules might be required between components.
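One way to wire that in (just a sketch; the category and the CEF fields are placeholders to adapt as noted above) is to send the event only when the NRT call fails:
if ! nw-recovery-tool --export --dump-dir /var/netwitness/nw-recovery-tool_backup --category Decoder; then
    logger -n <LOGDECODER_IP> -P 514 -t "$(hostname)" \
        "CEF:0|RSA|NetWitness Audit|11.2.0.0|BACKUP_FAILURE|Failed to backup component, check the logs|6|rt=$(date "+%b %d %Y %H:%M:%S") suser=admin sourceServiceName=SA_SERVER deviceProcessName=SA_SERVER outcome=Failure"
fi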
One question: how did you back up the services that require deploy passwords (AdminServer, ESA)? For me, this seemed to be the biggest issue, because I would have to hardcode passwords in the scripts somehow.