OpenNTI Documentation¶
OpenNTI is a container packaged with all tools needed to collect and visualize time series data from network devices. Data can be collected from different sources:
- Data Collection Agent: Collects data from devices using CLI/Shell or NETCONF
- Data Streaming Collector: Accepts all data streamed by Juniper devices as input (JTI, analyticsd, soon OpenConfig with gRPC)
- Statsd interface: Accepts any statsd packets
It’s pre-configured with all tools and a default dashboard: send it data and it will graph it.
Thanks to Docker, it can run pretty much anywhere: on a server, on a laptop ... even on the device itself.
A more detailed description of the project can be found here (including a series of videos on how to use it):
How to report feedback / participate in the project¶
For any issues, please open an issue on GitHub. For comments, suggestions, or questions, please use our Google Group.
To participate, please:
- Fork the project
- Send us a pull request
Note
If you are planning significant changes, please start a discussion first. Contributions are more than welcome!
How to install or upgrade¶
Requirements¶
The only requirements are to have docker and docker-compose installed on your Linux server or machine. Instructions to install them are available below.
How to Install/Start¶
OpenNTI is available on Docker Cloud, and this project provides scripts to easily download, start, and stop it.
git clone https://github.com/Juniper/open-nti.git
cd open-nti
make start
Note
On Ubuntu, you’ll have to add “sudo” before the last command
By default it will start 3 containers in non-persistent mode: once you stop it, all data are gone.
It’s possible to start the main container in persistent mode, saving the database outside the container, by using the startup script:
make start-persistent
Note
Persistent mode on Mac OS requires at least Docker v1.12
How to update¶
It’s recommended to upgrade the project periodically, both the files from github.com and the containers from Docker Hub. You can update both easily with:
make update
Customize OpenNTI¶
Customize container’s name and ports¶
All port numbers and container names used by the start/stop scripts are centralized in one file: open-nti.params. You can easily adapt this file with your own port numbers or names.
Note
This is mandatory if you are planning to run multiple instances of OpenNTI on the same server.
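For illustration, entries in open-nti.params might look like the following (the variable names here are illustrative assumptions, not the file’s actual ones; check the copy shipped with the project):
# Hypothetical excerpt of open-nti.params -- variable names are illustrative
GRAFANA_PORT=3000      # Grafana web UI (Grafana's default port is 3000)
INFLUXDB_PORT=8086     # InfluxDB HTTP API (InfluxDB's default port is 8086)
CONTAINER_NAME=opennti # prefix for container names, change it to run a second instance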
Customize the container itself¶
If you want to make some modifications, you can always build the container yourself using the script:
make build
Note
The first time you run make build, it will take 10-15 min to download and compile everything, but after that it will be very fast.
Architecture description [WIP]¶
The OpenNTI architecture is designed to be modular. The main components are a time series database (InfluxDB) and a graphical interface (Grafana).
Based on your needs, containers can be added or removed to add functionality.
Docker compose¶
All containers are started from docker-compose.yaml using:
./docker.start.sh
You can also create your own docker-compose file and pass it as an argument:
./docker.start.sh <my docker compose file>
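As a rough sketch, a minimal custom compose file could look like the one below. Service names, image names, and port mappings are illustrative assumptions; copy the project’s own docker-compose.yaml as your starting point:
# Hypothetical minimal compose file -- names and mappings are illustrative
opennti:
  image: juniper/open-nti:latest
  ports:
    - "3000:3000"        # Grafana web UI
    - "8086:8086"        # InfluxDB HTTP API
opennti-input-jti:
  image: juniper/open-nti-input-jti:latest
  ports:
    - "50000:50000/udp"  # JTI (GPB over UDP)
    - "50020:50020/udp"  # analyticsd (JSON over UDP)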
Data Collection Agent¶
Configuration¶
data/hosts.yaml
In data/hosts.yaml you need to provide the list of devices you want to pull information from. For each device, you need to indicate its name and one or more tags (at least one). Tags will be used later to determine which credentials should be used for the device and which commands need to be executed.
<hostA>: <tag1> <tag4>
<hostB>: <tag1> <tag4>
<hostC>: <tag2> <tag4> <tag5>
<hostD>: <tag1> <tag4>  <--- These tags link the hosts with the credentials and the commands to use
Example
mx-edge011: edge mx madrid bgp mpls
mx-agg011: agg mx madrid bgp isis
qfx-agg022: agg qfx munich bgp
qfx5100-02: tor qfx madrid isis
Note
The default configuration assumes that hosts defined in hosts.yaml can be resolved with DNS. If a host doesn’t have a DNS entry, it’s possible to indicate its IP address in the hosts.yaml file instead of its name:
192.168.0.1: edge mx madrid bgp mpls
To avoid using IP addresses in the dashboard, you can use the device hostname defined in its configuration instead of the value defined in hosts.yaml, by setting the parameter use_hostname to true in open-nti.variables.yaml:
use_hostname: True
data/credentials.yaml
You need to provide at least one credentials profile for your devices:
jdi_lab:
    username: '*login*'    (single quotes force the value to be imported as a string)
    password: '*password*' (single quotes force the value to be imported as a string)
    method: password       (other supported methods are 'key' and 'enc_key' for SSH key-based authentication)
    key_file: ./data/*key_file* (optional: only applies if method 'key' or 'enc_key' is used; the file must be located in the data directory)
    tags: tag1 tag2
data/commands.yaml
generic_commands:    <--- You can name the group as it best fits you
    commands: |
        show version | display xml                 <--- There is no limit on how many commands can be added to a group
        show isis statistics | display xml         <--- Before adding a command, confirm that there is a related parser
        show system buffers
        show system statistics icmp | display xml
        show route summary | display xml
    tags: tag1 tag2
Periodic Execution¶
To collect data periodically with the Data Collection Agent, you need to set up a cron job inside the container. As part of the project, open-nti provides scripts to easily add/remove cron jobs inside the container from the host.
Scripts provided:
- make cron-add: Create a new cron job inside the container
- make cron-show: Show all cron jobs configured inside the container
- make cron-delete: Delete a cron job inside the container for a specific tag
To start a cron job that executes the commands specified above for a specific tag every minute:
make cron-add TAG=lab
To start a cron job for more than one tag at the same time:
make cron-add TAG='lab prod'
To start a cron job that executes the commands for a specific tag every 5 minutes:
make cron-add TAG=tag1 TIME=5m
To start a cron job that executes the commands for a specific tag every hour:
make cron-add TAG=tag1 TIME=1h
To stop the cron job for a specific tag:
make cron-delete TAG=tag1
Note
If you want to configure the cron job yourself, open-nti uses this command:
/usr/bin/python /opt/open-nti/open-nti.py -s --tag <tag>
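For reference, a manually managed crontab entry that runs the agent every 5 minutes could look like this (the interval and tag are illustrative):
# Hypothetical crontab entry -- run the collection agent for tag 'lab' every 5 minutes
*/5 * * * * /usr/bin/python /opt/open-nti/open-nti.py -s --tag lab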
Junos Parsers¶
Parser | Description | Author |
---|---|---|
show-services-l2tp-summary.parser.yaml | None | anonymous |
show-bgp-summary.parser.yaml | None | anonymous |
show-pppoe-statistics.parser.yaml | None | anonymous |
rtsockmon.parser.yaml | None | anonymous |
show-network-access-aaa-radius-servers-detail.parser.yaml | None | anonymous |
show-isis-statistics.parser.yaml | None | anonymous |
show-subscribers-summary.parser.yaml | None | anonymous |
show-bfd-session-summary.parser.yaml | None | anonymous |
show-services-nat-pool-detail.parser.yaml | None | anonymous |
show-chassis-routing-engine.parser.yaml | None | anonymous |
show-firewall.parser.yaml | None | anonymous |
show-interfaces-media.parser.yaml | None | anonymous |
show-task-accounting.parser.yaml | None | anonymous |
show-subscribers-summary-port.parser.yaml | None | anonymous |
show-task-io.parser.yaml | None | anonymous |
show-services-stateful-firewall-flow-analysis.parser.yaml | None | anonymous |
show-bgp-neighbor-10.255.0.206.parser.yaml | None | anonymous |
show-network-access-aaa-statistics-address-assignment-pool.parser.yaml | None | anonymous |
show-system-resource-monitor-summary.parser.yaml | None | anonymous |
show-pfe-statistics-traffic.parser.yaml | None | anonymous |
show-system-statistics-icmp.parser.yaml | None | anonymous |
show-system-processes-extensive.parser.yaml | None | anonymous |
show-mpls-lsp.parser.yaml | None | anonymous |
show-system-virtual-memory.parser.yaml | None | anonymous |
show-services-stateful-firewall-subscriber-analysis.parser.yaml | None | anonymous |
show-services-video-monitoring-mdi-flow-fpc-slot-1.parser.yaml | None | anonymous |
show-version.parser.yaml | None | anonymous |
show-route-summary.parser.yaml | None | anonymous |
show-services-rpm-probe-results.parser.yaml | None | anonymous |
show-snmp-statistics.parser.yaml | None | anonymous |
show-system-buffers.parser.yaml | None | anonymous |
Data Streaming Collector¶
Currently the collector accepts:
- Analyticsd (QFX5k) streams in JSON/UDP on port UDP/50020
- Juniper Telemetry Interface (MX/PTX) streams in GPB/UDP on port UDP/50000
Important
It’s important that all devices have the correct time set; it’s recommended to configure NTP everywhere.
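For example, on Junos, NTP can be enabled with a one-line configuration (replace the placeholder with your NTP server address):
set system ntp server <ntp-server-address>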
statsd interface¶
open-nti uses Telegraf to support statsd. Statsd, designed by Etsy, is a popular tool for sending metrics over the network.
Here is an example of how to insert statsd data into the database:
root@d3e82264a08b:/# echo "opennti,device=qfx5100,type=int.rx:100|g" | nc -w 1 -u 127.0.0.1 8125
In this example:
- opennti defines the series
- device=qfx5100,type=int.rx will be converted into tags
- 100 is the value
- g indicates a gauge
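Statsd supports other metric types as well; for instance, a counter can be sent with |c instead of |g (the metric name below is illustrative):
# Hypothetical counter metric -- increments a count instead of setting an absolute value
echo "opennti,device=qfx5100,type=int.errors:1|c" | nc -w 1 -u 127.0.0.1 8125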
Events¶
By default, dashboards are configured to display some “events” that are stored in the database in the series “events”. There are multiple ways to record an entry in the events series.
Insert events via syslog¶
open-nti accepts events in syslog format on port UDP/6000. The goal is not to send all syslog messages, but only relevant information like commits or protocol flaps.
To send a syslog message only at commit time, you can use the configuration below:
set system syslog host 192.168.99.100 any any
set system syslog host 192.168.99.100 match UI_COMMIT_COMPLETED
set system syslog host 192.168.99.100 port 6000
Insert events in the database directly¶
It’s possible to insert events with just an HTTP POST request to the database. Here is an example using curl:
curl -i -XPOST 'http://10.92.71.225:8086/write?db=juniper' --data-binary 'events,type=Error text="BGP Flap"'
curl -i -XPOST 'http://10.92.71.225:8086/write?db=juniper' --data-binary 'events,device=qfx5100-01,type=Commit text="Change applied"'
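By default InfluxDB timestamps the point when it is written; to backdate an event, the line protocol also accepts an optional trailing timestamp in nanoseconds since epoch (the event type and timestamp below are illustrative):
# Hypothetical backdated event -- trailing number is an epoch timestamp in nanoseconds
curl -i -XPOST 'http://10.92.71.225:8086/write?db=juniper' --data-binary 'events,type=Maintenance text="Window start" 1464000000000000000'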
Note
Any system that knows how to generate an HTTP POST request can inject an event. It’s very useful if you have a script or tool that runs tests, to keep track of when major events happen.
Dashboard Generator¶
OpenNTI integrates a dashboard generator based on Python and Jinja2.
This dashboard generator can be very useful in many situations:
- Convert JTI graphs to the new variable names
- Create graphs for the new JTI sensors (LSP, FW, etc.)
- Add templating for interfaces
- Create dashboards for Netconf in modes 2 & 3
- Create dashboards on demand, more personalized
In a nutshell, it splits a Grafana dashboard into multiple templated pieces:
- The skeleton of the dashboard
- Rows, composed of multiple panels or graphs
- Graphs
- Annotations, events overlaid on the graphs
- Templatings, drop-down menus to narrow the scope
To generate a dashboard, you need to create a YAML file that indicates the title, which rows, which annotations, etc.:
title: Data Streaming Collector ALPHA
template: "dashboard_base.j2"
tags:
- opennti
rows:
- int-traffic.yaml
- int-queue.yaml
- int-buffer.yaml
templatings:
- host_regex.yaml
- interface.yaml
annotations:
- commit.yaml
- bgp_state.yaml
To generate the dashboard based on this config file, you just have to run:
cd dashboards/
python gendashboard.py --file data_streaming_collector.yaml
The rows are defined in the directory templates/rows/ and the graphs in the directory templates/graphs/.
The idea is to define a template for each configuration file, so we don’t need to turn everything into a variable in the templates: if two graphs are very different, we can simply use two different templates. This keeps the YAML files light and easily readable.
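To give an idea of the split, a row definition such as int-traffic.yaml might reference a row template and list the graphs it contains. The field names and the row_base.j2 template below are hypothetical; inspect the files in templates/rows/ for the real schema:
# Hypothetical row definition -- field names and template name are illustrative
title: Interfaces statistics
template: "row_base.j2"
graphs:
  - int-bps.yaml
  - int-pps.yaml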
Note
You can browse all rows, graphs, templatings and annotations available in the Dashboard Library
Dashboard Library¶
Graphs¶
Graph | File |
---|---|
DCD SIZE | re-memory-dcd.yaml |
Broadcast/Multicast/Unicast Interface Traffic | int-pps-ucast-bcast-mcast.yaml |
RPM Probes | rpm-probes.yaml |
Buffer Latency | int-buffer-latency.yaml |
Firewall Filter Memory Usage | fw-memory.yaml |
Aggregated TX/RX (PPS) | int-pps-aggr.yaml |
SMI-HELPER SIZE | re-memory-smihelperd.yaml |
RE - CPU idle | re-memory-utilization.yaml |
Buffer Utilization | int-buffer-size.yaml |
ISIS Total packets-processed (delta) | isis-packet-total-processed.yaml |
ISIS LSP regenerations (delta) | isis-lsp-regeneration-run.yaml |
COSD SIZE | re-memory-cosd.yaml |
Firewall Filter Counter - Bytes | fw-counter-bytes.yaml |
SMID SIZE | re-memory-smid.yaml |
ISIS Total packets-sent (delta) | isis-packet-total-sent.yaml |
Traffic Statistics (PPS) | int-pps.yaml |
ISIS Total packets-dropped (delta) | isis-packet-total-dropped.yaml |
SNMPD SIZE | re-memory-snmpd.yaml |
ISIS fragments-rebuilt (delta) | isis-fragment-rebuilt.yaml |
ISIS Total packets-received (delta) | isis-packet-total-received.yaml |
RPD SIZE | re-memory-rpd.yaml |
Firewall Filter Counter - Packets | fw-counter-packets.yaml |
Traffic Statistics (BPS) | int-bps.yaml |
MIB2D SIZE | re-memory-mib2d.yaml |
DFWD SIZE | re-memory-dfwd.yaml |
CPU Memory - Size | cpumem-size.yaml |
CPU Memory - Bytes allocated | cpumem-bytes.yaml |
ISIS spf-runs (delta) | isis-spf-run.yaml |
BGP peer-count | bgp-peer-count.yaml |
Interface Queue Stats | int-queue-stat.yaml |
CPU Memory - Utilization | cpumem-utilization.yaml |
LSP Traffic Rate (BPS) | mpls-lsp-traffic-bps.yaml |
Traffic Statistics (BPS) | jti-oc-int-bps.yaml |
Traffic Packet Count | int-packets.yaml |
Queue Traffic Statistics (BPS) | jti-oc-queue-bps.yaml |
ISIS Total packets-retransmitted (delta) | isis-packet-total-retransmit.yaml |
LSP Traffic Rate (PPS) | mpls-lsp-traffic-pps.yaml |
Interface Error Statistics | int-error.yaml |
BGP down-peer-count | bgp-peer-down-count.yaml |
RE - CPU idle | re-cpu-idle.yaml |
SAMPLED SIZE | re-memory-sampled.yaml |
BGP group-count | bgp-group-count.yaml |
Rows¶
Rows | File |
---|---|
RPM Probes | rpm-probes.yaml |
MPLS LSP | mpls-lsp.yaml |
CPU / Memory | cpumem.yaml |
Interfaces statistics | int-traffic.yaml |
BGP Statistics | protocol-bgp.yaml |
ISIS Statistics | protocol-isis.yaml |
Interfaces Queue | int-queue.yaml |
Queue Stats | qfx-queue.yaml |
Interfaces Buffer | int-buffer.yaml |
RE Processes | re-processes.yaml |
Firewall Filters | firewall.yaml |
RE Statistics | re-statistics.yaml |
Interfaces Traffic Packets | int-traffic-packets.yaml |
Disclaimer | opennti-disclaimer.yaml |
Annotations¶
Annotations | File |
---|---|
Interface Flap | interface_flap.yaml |
BGP Flap | bgp_state.yaml |
LDP Down | ldp_down.yaml |
Commit | commit.yaml |
Templatings¶
Templating | File |
---|---|
Host regex | host_regex.yaml |
Interface | interface.yaml |
Troubleshooting Guide¶
To check if the containers are running, execute the following command. By default you should have 3 containers running:
docker ps
To force containers to stop, execute
make stop
To access the CLI of the main container for debugging, start an SSH session using the insecure_key provided in the repo and the script docker.cli.sh:
make cli
For the input containers named open-nti-input-*, you can access the logs directly from Docker by running:
docker logs <container name or ID>
Data Collection Agent¶
Q - I configured the hosts/credentials/commands.yaml files but I’m not seeing anything on the dashboard¶
To make sure everything is working as expected, you can run the Data Collection Agent in debug mode
make cron-debug TAG=lab
Data Streaming Collector¶
Q - I’m streaming data from devices but I’m not seeing anything on the Dashboard¶
To reach the dashboard, traffic has to go through the following path: Device >(A)> Host >(B)> Container >(C)> Fluentd >(D)> InfluxDB >(E)> Grafana
A - Check the timestamp on the devices and on the server
Timestamps MUST match on both sides, the server and the Junos devices. This is the most common issue.
B - Check that traffic is reaching the Host
The best solution is to use TCPDUMP on the Host and filter on destination port
On Unix/Mac
tcpdump -i <ingress interface> -n dst port <dest port number>
C - Check that traffic is reaching the container
The best solution is to use TCPDUMP inside the container
./docker.cli.sh
tcpdump -i eth0 -n dst port <dest port number>
An RPF check might be the problem if you see incoming packets in step B but not in step C.
If, for example, you use a source IP for which there is no route entry on the host OS (Ubuntu
does an RPF check by default), the packets will be discarded.
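On Linux you can inspect and, if needed, relax the reverse-path filter with sysctl (shown here for all interfaces; a value of 1 is strict mode, 2 is loose mode):
# Show the current reverse-path filter setting
sysctl net.ipv4.conf.all.rp_filter
# Switch to loose mode if strict RPF is dropping your packets
sudo sysctl -w net.ipv4.conf.all.rp_filter=2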
D - Check Fluentd
Check the fluentd logs inside the container:
docker logs opennti_input_jti
Nothing should be printed if everything is working correctly.
E - Check if data is properly reaching the database
- Connect to the InfluxDB management interface with a browser on port 8083
- Select juniper as the database in the top right corner
- Run the query `show measurements` to see what is present
- Then run `SELECT * FROM "<measurement>"` to inspect the data
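The same checks can be done from the command line against InfluxDB’s HTTP query endpoint (adjust the host and port to your setup; the measurement name below is illustrative):
# List the measurements stored in the juniper database
curl -G 'http://localhost:8086/query?db=juniper' --data-urlencode 'q=SHOW MEASUREMENTS'
# Inspect a few points from one measurement
curl -G 'http://localhost:8086/query?db=juniper' --data-urlencode 'q=SELECT * FROM "events" LIMIT 5'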
Note
- Destination measurements will vary depending on the incoming traffic:
- For MX > jnpr.jvision
- For QFX5100/EX4300 > jnpr.analyticsd