Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default route in "vpn" mode is incorrect on FreeBSD #580

Closed
dch opened this issue Sep 8, 2017 · 6 comments
Closed

default route in "vpn" mode is incorrect on FreeBSD #580

dch opened this issue Sep 8, 2017 · 6 comments
Labels
Type: Bug Bug to be resolved

Comments

@dch
Copy link
Contributor

dch commented Sep 8, 2017

setup

  • FreeBSD 12.0-CURRENT amd64 & zerotier 1.2.4 installed

  • 1 zt network with allowDefault=0 connecting to a working ZT VPN gateway

  • iphone, imac work via this vpn gateway as default route

  • connecting successfully to various hosts

  • when zerotier-cli set 12345678 allowDefault=1 is run, all network connectivity to/from laptop is lost (no ping in/out from local network nor zt ones, external devices like firewalls also cannot connect

  • when this is reversed the system returns to normal

# allowdefault=0 (off/working zt config)
# netstat -rn4

Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            172.16.2.1         UGS       wlan0
10.144.0.0/16      link#3             U      zt1flo98
10.144.49.109      link#3             UHS         lo0
127.0.0.1          link#1             UH          lo0
172.16.2.0/24      link#2             U         wlan0
172.16.2.15        link#2             UHS         lo0
# allowdefault=1 (on/broken zt config)
# netstat -rn4

Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
0.0.0.0/1          10.144.0.1         UGS    zt1flo98
default            172.16.2.1         UGS       wlan0
10.144.0.0/16      link#3             U      zt1flo98
10.144.49.109      link#3             UHS         lo0
127.0.0.1          link#1             UH          lo0
128.0.0.0/1        10.144.0.1         UGS    zt1flo98
172.16.2.0/24      link#2             U         wlan0
172.16.2.15        link#2             UHS         lo0

more logs & debugging or remote access available as needed.

@dch
Copy link
Contributor Author

dch commented Sep 11, 2017

So today I tried adding the 2 routes that @adamierymenko mentioned in a previous issue, but manually.

zt1flo98dm0peka: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 5000 mtu 2800
	options=80000<LINKSTATE>
	ether 8a:58:82:3c:2f:91
	hwaddr 00:bd:75:ec:52:09
	inet 10.244.23.143 netmask 0xffff0000 broadcast 10.244.255.255 
	inet6 fc7b:dbb3:c9e2:8e50:6c98::1 prefixlen 40 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
	groups: tap 
	Opened by PID 75070

root@wintermute /t/zt# netstat -rn4 > before.lst
root@wintermute /t/zt# route add 0.0.0.0/1 10.244.0.1
add net 0.0.0.0: gateway 10.244.0.1
root@wintermute /t/zt# route add 128.0.0.0/1 10.244.0.1
add net 128.0.0.0: gateway 10.244.0.1
root@wintermute /t/zt# netstat -rn4 | tee after.lst
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
0.0.0.0/1          10.244.0.1         UGS    zt1flo98
default            172.16.1.1         UGS        igb0
10.144.0.0/16      link#6             U      zt1flo98
10.144.0.2         link#6             UHS         lo0
10.241.0.0         link#4             UH          lo1
10.241.0.1         link#4             UH          lo1
10.241.0.2         link#4             UH          lo1
10.241.0.3         link#4             UH          lo1
10.241.0.4         link#4             UH          lo1
10.241.0.5         link#4             UH          lo1
10.241.0.6         link#4             UH          lo1
10.241.0.7         link#4             UH          lo1
10.241.0.8         link#4             UH          lo1
10.241.0.9         link#4             UH          lo1
10.241.0.10        link#4             UH          lo1
10.241.0.11        link#4             UH          lo1
10.241.0.12        link#4             UH          lo1
10.241.0.13        link#4             UH          lo1
10.241.0.14        link#4             UH          lo1
10.241.0.15        link#4             UH          lo1
10.244.0.0/16      link#5             U      zt1flo98
10.244.23.143      link#5             UHS         lo0
127.0.0.1          link#3             UH          lo0
128.0.0.0/1        10.244.0.1         UGS    zt1flo98
172.16.1.0/24      link#1             U          igb0
172.16.1.14        link#1             UHS         lo0
192.168.0.0        link#2             UHS         lo0
192.168.0.0/16     link#2             U          igb1



root@wintermute /t/zt# patdiff before.lst after.lst 
------ before.lst
++++++ after.lst
@|-1,10 +1,11 ============================================================
 |Routing tables
 |
 |Internet:
 |Destination        Gateway            Flags     Netif Expire
+|0.0.0.0/1          10.244.0.1         UGS    zt1flo98
 |default            172.16.1.1         UGS        igb0
 |10.144.0.0/16      link#6             U      zt1flo98
 |10.144.0.2         link#6             UHS         lo0
 |10.241.0.0         link#4             UH          lo1
 |10.241.0.1         link#4             UH          lo1
 |10.241.0.2         link#4             UH          lo1
@|-21,12 +22,13 ============================================================
 |10.241.0.13        link#4             UH          lo1
 |10.241.0.14        link#4             UH          lo1
 |10.241.0.15        link#4             UH          lo1
 |10.244.0.0/16      link#5             U      zt1flo98
 |10.244.23.143      link#5             UHS         lo0
 |127.0.0.1          link#3             UH          lo0
+|128.0.0.0/1        10.244.0.1         UGS    zt1flo98
 |172.16.1.0/24      link#1             U          igb0
 |172.16.1.14        link#1             UHS         lo0
 |192.168.0.0        link#2             UHS         lo0
 |192.168.0.0/16     link#2             U          igb1

NB this is a different box (ethernet, no wifi, 2 NICs, 2 zt networks, 10.241.x assigned to jails) but I get the same result.

It's not clear to me yet whether this is a FreeBSD-specific issue in zerotier, or zerotier using FreeBSD routing incorrectly, or something else entirely. Apologies if I'm barking up the wrong tree.

@dch
Copy link
Contributor Author

dch commented Sep 11, 2017

also huge thanks to @dharrigan for working through this with me!

@kkdzo
Copy link

kkdzo commented Sep 15, 2017

I'm quite interested in seeing this fixed as well, in the mean time I got a workaround working by using multiple routing tables.

Since the usual routing table change doesn't work in FreeBSD (and OpenBSD), I looked into using a separate routing table for zerotier.
The idea is to have one table for zerotier only which has a default route to a regular gateway and another one for the rest of the system and applications, this one pointing to the zerotier gateway.

In order to use multiple routing tables, net.fibs=2 needs to be added to /boot/loader.conf (2 is the number of routing tables needed, up to 16). A reboot is required after making this change.
FreeBSD allows us to tell a process to use a specific routing table with the setfib utility.
I wrote a small rc.d script that takes care of the routing tables changes at boot time and also starts zerotier.
This replaces the rc.d script that comes packaged with zerotier.

The steps in the script are:

  • Add the regular default route taken from the first / default routing table to the second one.
    This second routing table is only used by zerotier and will allow it to reach the outside world.
  • Start zerotier and tell it to use the second routing table
    Upon automatically joining the network we will receive the managed routes defined for that network.
    Wait 3 seconds to allow for zerotier to do what it has to do.
  • Grab the gateway for default route out of the zerotier managed routes and make it the next hop in the default routing table.
    I used jq to extract the ip from the json output, this needs to be installed.
    This effectively forces any process using the default routing table (all except zerotier-one) to go through zerotier

Since zerotier is started inside the script, it should not be enabled on its own, my /etc/rc.conf has the following entries:

zerotier_enable="NO"
ztroute_enable="YES"

And the script itself that I put in /usr/local/etc/rc.d/ztroute, with executable permission.

#!/bin/sh

# $FreeBSD$
#
# PROVIDE: ztroute
# REQUIRE: zerotier
#
# Add these lines to /etc/rc.conf.local or /etc/rc.conf
# to enable this service:
#
# ztroute_enable (bool):        Set to NO by default.
#                               Set it to YES to enable zerotier.

. /etc/rc.subr

name=ztroute
rcvar=ztroute_enable

load_rc_config $name

: ${ztroute_enable:="NO"}

start_cmd="ztroute_start"

ztroute_start()
{
        echo "Adding default route to fib 1"
        setfib 1 route add default $(netstat -rnf inet |grep default | awk '{ print $2}')
        echo "Starting Zerotier"
        setfib 1 /usr/local/sbin/zerotier-one -d
        sleep 3
        echo "Replacing fib 0 default route with ZT route"
        setfib 0 route change default $(/usr/local/bin/zerotier-cli listnetworks -j |/usr/local/bin/jq '.[].routes[] | select(.target=="0.0.0.0/0") | .via' | sed -e 's/"//g')
}

run_rc_command "$1"

This can obviously be improved in many ways like adding logic to disable and re-enable the route or having a cron entry that checks if the managed route has changed and update the table accordingly but for now it works for me.

One last thing, sshd should be started after this script is done, without that, it will not bind to the zerotier interface and sshing to the zerotier ip will not be possible.

Hope this helps.

@janjaapbos
Copy link
Contributor

Thank you very much @kkdzo for nailing this down. Since I don't use freebsd I'll set up a vm with freebsd and test it myself as well.

@dch
Copy link
Contributor Author

dch commented Sep 16, 2017

@kkdzo .a.w.e.s.o.m.e. with your permission I'll roll these changes into the port itself. Let me know how you'd like to be credited.

@dharrigan
Copy link

I look forward to this as well :-)

@adamierymenko adamierymenko added the Type: Bug Bug to be resolved label Oct 16, 2017
laduke added a commit that referenced this issue Jul 13, 2023
It doesn't work.
Not possible to fix with deficient network
stack and APIs.

ZeroTierOne-freebsd # zerotier-cli set 9bee8941b5xxxxxx allowDefault=1
400 set Allow Default does not work properly on FreeBSD. See #580
root@freebsd13-a:~/ZeroTierOne-freebsd # zerotier-cli get 9bee8941b5xxxxxx allowDefault
1
laduke added a commit that referenced this issue Jul 13, 2023
It doesn't work.
Not possible to fix with deficient network
stack and APIs.

ZeroTierOne-freebsd # zerotier-cli set 9bee8941b5xxxxxx allowDefault=1
400 set Allow Default does not work properly on FreeBSD. See #580
root@freebsd13-a:~/ZeroTierOne-freebsd # zerotier-cli get 9bee8941b5xxxxxx allowDefault
1
laduke added a commit that referenced this issue Jul 13, 2023
It doesn't work.
Not possible to fix with deficient network
stack and APIs.

ZeroTierOne-freebsd # zerotier-cli set 9bee8941b5xxxxxx allowDefault=1
400 set Allow Default does not work properly on FreeBSD. See #580
root@freebsd13-a:~/ZeroTierOne-freebsd # zerotier-cli get 9bee8941b5xxxxxx allowDefault
1
adamierymenko added a commit that referenced this issue Aug 23, 2023
* add note about forceTcpRelay

* Create a sample systemd unit for tcp proxy

* set gitattributes for rust & cargo so hashes dont conflict on Windows

* Revert "set gitattributes for rust & cargo so hashes dont conflict on Windows"

This reverts commit 032dc5c.

* Turn off autocrlf for rust source

Doesn't appear to play nice well when it comes to git and vendored cargo package hashes

* Fix #1883 (#1886)

Still unknown as to why, but the call to `nc->GetProperties()` can fail
when setting a friendly name on the Windows virtual ethernet adapter.
Ensure that `ncp` is not null before continuing and accessing the device
GUID.

* Don't vendor packages for zeroidc (#1885)

* Added docker environment way to join networks (#1871)

* add StringUtils

* fix headers
use recommended headers and remove unused headers

* move extern "C"
only JNI functions need to be exported

* cleanup

* fix ANDROID-50: RESULT_ERROR_BAD_PARAMETER typo

* fix typo in log message

* fix typos in JNI method signatures

* fix typo

* fix ANDROID-51: fieldName is uninitialized

* fix ANDROID-35: memory leak

* fix missing DeleteLocalRef in loops

* update to use unique error codes

* add GETENV macro

* add LOG_TAG defines

* ANDROID-48: add ZT_jnicache.cpp

* ANDROID-48: use ZT_jnicache.cpp and remove ZT_jnilookup.cpp and ZT_jniarray.cpp

* add Event.fromInt

* add PeerRole.fromInt

* add ResultCode.fromInt

* fix ANDROID-36: issues with ResultCode

* add VirtualNetworkConfigOperation.fromInt

* fix ANDROID-40: VirtualNetworkConfigOperation out-of-sync with ZT_VirtualNetworkConfigOperation enum

* add VirtualNetworkStatus.fromInt

* fix ANDROID-37: VirtualNetworkStatus out-of-sync with ZT_VirtualNetworkStatus enum

* add VirtualNetworkType.fromInt

* make NodeStatus a plain data class

* fix ANDROID-52: synchronization bug with nodeMap

* Node init work: separate Node construction and init

* add Node.toString

* make PeerPhysicalPath a plain data class

* remove unused PeerPhysicalPath.fixed

* add array functions

* make Peer a plain data class

* make Version a plain data class

* fix ANDROID-42: copy/paste error

* fix ANDROID-49: VirtualNetworkConfig.equals is wrong

* reimplement VirtualNetworkConfig.equals

* reimplement VirtualNetworkConfig.compareTo

* add VirtualNetworkConfig.hashCode

* make VirtualNetworkConfig a plain data class

* remove unused VirtualNetworkConfig.enabled

* reimplement VirtualNetworkDNS.equals

* add VirtualNetworkDNS.hashCode

* make VirtualNetworkDNS a plain data class

* reimplement VirtualNetworkRoute.equals

* reimplement VirtualNetworkRoute.compareTo

* reimplement VirtualNetworkRoute.toString

* add VirtualNetworkRoute.hashCode

* make VirtualNetworkRoute a plain data class

* add isSocketAddressEmpty

* add addressPort

* add fromSocketAddressObject

* invert logic in a couple of places and return early

* newInetAddress and newInetSocketAddress work
allow newInetSocketAddress to return NULL if given empty address

* fix ANDROID-38: stack corruption in onSendPacketRequested

* use GETENV macro

* JniRef work
JniRef does not use callbacks struct, so remove
fix NewGlobalRef / DeleteGlobalRef mismatch

* use PRId64 macros

* switch statement work

* comments and logging

* Modifier 'public' is redundant for interface members

* NodeException can be made a checked Exception

* 'NodeException' does not define a 'serialVersionUID' field

* 'finalize()' should not be overridden
this is fine to do because ZeroTierOneService calls close() when it is done

* error handling, error reporting, asserts, logging

* simplify loadLibrary

* rename Node.networks -> Node.networkConfigs

* Windows file permissions fix (#1887)

* Allow macOS interfaces to use multiple IP addresses (#1879)

Co-authored-by: Sean OMeara <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>

* Fix condition where full HELLOs might not be sent when necessary (#1877)

Co-authored-by: Grant Limberg <[email protected]>

* 1.10.4 version bumps

* Add security policy to repo (#1889)

* [+] add e2k64 arch (#1890)

* temp fix for ANDROID-56: crash inside newNetworkConfig from too many args

* 1.10.4 release notes

* Windows 1.10.4 Advanced Installer bump

* Revert "temp fix for ANDROID-56: crash inside newNetworkConfig from too many args"

This reverts commit dd627cd.

* actual fix for ANDROID-56: crash inside newNetworkConfig
cast all arguments to varargs functions as good style

* Fix addIp being called with applied ips (#1897)

This was getting called outside of the check for existing ips
Because of the added ifdef and a brace getting moved to the
wrong place.

```
if (! n.tap()->addIp(*ip)) {
	fprintf(stderr, "ERROR: unable to add ip address %s" ZT_EOL_S, ip->toString(ipbuf));
}
WinFWHelper::newICMPRule(*ip, n.config().nwid);

```

* 1.10.5 (#1905)

* 1.10.5 bump

* 1.10.5 for Windows

* 1.10.5

* Prevent path-learning loops (#1914)

* Prevent path-learning loops

* Only allow new overwrite if not bonded

* fix binding temporary ipv6 addresses on macos (#1910)

The check code wasn't running.

I don't know why !defined(TARGET_OS_IOS) would exclude code on
desktop macOS. I did a quick search and changed it to defined(TARGET_OS_MAC).
Not 100% sure what the most correct solution there is.

You can verify the old and new versions with

`ifconfig | grep temporary`

plus

`zerotier-cli info -j` -> listeningOn

* 1.10.6 (#1929)

* 1.10.5 bump

* 1.10.6

* 1.10.6 AIP for Windows.

* Release notes for 1.10.6 (#1931)

* Minor tweak to Synology Docker image script (#1936)

* Change if_def again so ios can build (#1937)

All apple's variables are "defined"
but sometimes they are defined as "0"

* move begin/commit into try/catch block (#1932)

Thread was exiting in some cases

* Bump openssl from 0.10.45 to 0.10.48 in /zeroidc (#1938)

Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.45 to 0.10.48.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](sfackler/rust-openssl@openssl-v0.10.45...openssl-v0.10.48)

---
updated-dependencies:
- dependency-name: openssl
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* new drone bits

* Fix multiple network join from environment entrypoint.sh.release (#1961)

* _bond_m guards _bond, not _paths_m (#1965)

* Fix: warning: mutex '_aqm_m' is not held on every path through here [-Wthread-safety-analysis] (#1964)

* Bump h2 from 0.3.16 to 0.3.17 in /zeroidc (#1963)

Bumps [h2](https://github.com/hyperium/h2) from 0.3.16 to 0.3.17.
- [Release notes](https://github.com/hyperium/h2/releases)
- [Changelog](https://github.com/hyperium/h2/blob/master/CHANGELOG.md)
- [Commits](hyperium/h2@v0.3.16...v0.3.17)

---
updated-dependencies:
- dependency-name: h2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Grant Limberg <[email protected]>

* Add note that binutils is required on FreeBSD (#1968)

* Add prometheus metrics for Central controllers (#1969)

* add header-only prometheus lib to ext

* rename folder

* Undo rename directory

* prometheus simpleapi included on mac & linux

* wip

* wire up some controller stats

* Get windows building with prometheus

* bsd build flags for prometheus

* Fix multiple network join from environment entrypoint.sh.release (#1961)

* _bond_m guards _bond, not _paths_m (#1965)

* Fix: warning: mutex '_aqm_m' is not held on every path through here [-Wthread-safety-analysis] (#1964)

* Serve prom metrics from /metrics endpoint

* Add prom metrics for Central controller specific things

* reorganize metric initialization

* testing out a labled gauge on Networks

* increment error counter on throw

* Consolidate metrics definitions

Put all metric definitions into node/Metrics.hpp.  Accessed as needed
from there.

* Revert "testing out a labled gauge on Networks"

This reverts commit 499ed6d.

* still blows up but adding to the record for completeness right now

* Fix runtime issues with metrics

* Add metrics files to visual studio project

* Missed an "extern"

* add copyright headers to new files

* Add metrics for sent/received bytes (total)

* put /metrics endpoint behind auth

* sendto returns int on Win32

---------

Co-authored-by: Leonardo Amaral <[email protected]>
Co-authored-by: Brenton Bostick <[email protected]>

* Central startup update (#1973)

* allow specifying authtoken in central startup

* set allowManagedFrom

* move redis_mem_notification to the correct place

* add node checkins metric

* wire up min/max connection pool size metrics

* x86_64-unknown-linux-gnu on ubuntu runner (#1975)

* adding incoming zt packet type metrics (#1976)

* use cpp-httplib for HTTP control plane (#1979)

refactored the old control plane code to use [cpp-httplib](https://github.com/yhirose/cpp-httplib) instead of a hand rolled HTTP server.  Makes the control plane code much more legible.  Also no longer randomly stops responding.

* Outgoing Packet Metrics (#1980)

add tx/rx labels to packet counters and add metrics for outgoing packets

* Add short-term validation test workflow (#1974)

Add short-term validation test workflow

* Brenton/curly braces (#1971)

* fix formatting

* properly adjust various lines
breakup multiple statements onto multiple lines

* insert {} around if, for, etc.

* Fix rust dependency caching (#1983)

* fun with rust caching

* kick

* comment out invalid yaml keys for now

* Caching should now work

* re-add/rename key directives

* bump

* bump

* bump

* Don't force rebuild on Windows build GH Action (#1985)

Switching `/t:ZeroTierOne:Rebuild` to just `/t:ZeroTierOne` allows the Windows build to use the rust cache.  `/t:ZeroTierOne:Rebuild` cleared the cache before building.

* More packet metrics (#1982)

* found path negotation sends that weren't accounted for

* Fix histogram so it will actually compile

* Found more places for packet metrics

* separate the bind & listen calls on the http backplane (#1988)

* fix memory leak (#1992)

* fix a couple of metrics (#1989)

* More aggressive CLI spamming (#1993)

* fix type signatures (#1991)

* Network-metrics (#1994)

* Add a couple quick functions for converting a uint64_t network ID/node ID into std::string

* Network metrics

* Peer metrics (#1995)

* Adding peer metrics

still need to be wired up for use

* per peer packet metrics

* Fix crash from bad instantiation of histogram

* separate alive & dead path counts

* Add peer metric update block

* add peer latency values in doPingAndKeepalive

* prevent deadlock

* peer latency histogram actually works now

* cleanup

* capture counts of packets to specific peers

---------

Co-authored-by: Joseph Henry <[email protected]>

* Metrics consolidation (#1997)

* Rename zt_packet_incoming -> zt_packet

Also consolidate zt_peer_packets into a single metric with tx and rx labels.  Same for ztc_tcp_data and ztc_udp_data

* Further collapse tcp & udp into metric labels for zt_data

* Fix zt_data metric description

* zt_peer_packets description fix

* Consolidate incoming/outgoing network packets to a single metric

* zt_incoming_packet_error -> zt_packet_error

* Disable peer metrics for central controllers

Can change in the future if needed, but given the traffic our controllers serve, that's going to be a *lot* of data

* Disable peer metrics for controllers pt 2

* Update readme files for metrics (#2000)

* Controller Metrics & Network Config Request Fix (#2003)

* add new metrics for network config request queue size and sso expirations
* move sso expiration to its own thread in the controller
* fix potential undefined behavior when modifying a set

* Enable RTTI in Windows build

The new prometheus histogram stuff needs it.

Access violation - no RTTI data!INVALID packet 636ebd9ee8cac6c0 from cafe9efeb9(2605:9880:200:1200:30:571:e34:51/9993) (unexpected exception in tryDecode())

* Don't re-apply routes on BSD

See issue #1986

* Capture setContent by-value instead of by-reference (#2006)

Co-authored-by: Grant Limberg <[email protected]>

* fix typos (#2010)

* central controller metrics & request path updates (#2012)

* internal db metrics

* use shared mutexes for read/write locks

* remove this lock. only used for a metric

* more metrics

* remove exploratory metrics

place controller request benchmarks behind ifdef

* Improve validation test (#2013)

* fix init order for EmbeddedNetworkController (#2014)

* add constant for getifaddrs cache time

* cache getifaddrs - mac

* cache getifaddrs - linux

* cache getifaddrs - bsd

* cache getifaddrs - windows

* Fix oidc client lookup query

join condition referenced the wrong table.  Worked fine unless there were multiple identical client IDs

* Fix udp sent metric

was only incrementing by 1 for each packet sent

* Allow sending all surface addresses to peer in low-bandwidth mode

* allow enabling of low bandwidth mode on controllers

* don't unborrow bad connections

pool will clean them up later

* Multi-arch controller container (#2037)

create arm64 & amd64 images for central controller

* Update README.md

issue #2009

* docker tags change

* fix oidc auth url memory leak (#2031)

getAuthURL() was not calling zeroidc::free_cstr(url);

the only place authAuthURL is called, the url can be retrieved
from the network config instead.

You could alternatively copy the string and call free_cstr in getAuthURL.
If that's better we can change the PR.

Since now there are no callers of getAuthURL I deleted it.

Co-authored-by: Grant Limberg <[email protected]>

* Bump openssl from 0.10.48 to 0.10.55 in /zeroidc (#2034)

Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.48 to 0.10.55.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](sfackler/rust-openssl@openssl-v0.10.48...openssl-v0.10.55)

---
updated-dependencies:
- dependency-name: openssl
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Grant Limberg <[email protected]>

* zeroidc cargo warnings (#2029)

* fix unused struct member cargo warning

* fix unused import cargo warning

* fix unused return value cargo warning

---------

Co-authored-by: Grant Limberg <[email protected]>

* fix memory leak in macos ipv6/dns helper (#2030)

Co-authored-by: Grant Limberg <[email protected]>

* Consider ZEROTIER_JOIN_NETWORKS in healthcheck (#1978)

* Add a 2nd auth token only for access to /metrics (#2043)

* Add a 2nd auth token for /metrics

Allows administrators to distribute a token that only has access to read
metrics and nothing else.

Also added support for using bearer auth tokens for both types of tokens

Separate endpoint for metrics #2041

* Update readme

* fix a couple of cases of writing the wrong token

* Add warning to cli for allow default on FreeBSD

It doesn't work.
Not possible to fix with deficient network
stack and APIs.

ZeroTierOne-freebsd # zerotier-cli set 9bee8941b5xxxxxx allowDefault=1
400 set Allow Default does not work properly on FreeBSD. See #580
root@freebsd13-a:~/ZeroTierOne-freebsd # zerotier-cli get 9bee8941b5xxxxxx allowDefault
1

* ARM64 Support for TapDriver6 (#1949)

* Release memory previously allocated by UPNP_GetValidIGD

* Fix ifdef that breaks libzt on iOS (#2050)

* less drone (#2060)

* Exit if loading an invalid identity from disk (#2058)

* Exit if loading an invalid identity from disk

Previously, if an invalid identity was loaded from disk, ZeroTier would
generate a new identity & chug along and generate a brand new identity
as if nothing happened.  When running in containers, this introduces the
possibility for key matter loss; especially when running in containers
where the identity files are mounted in the container read only.  In
this case, ZT will continue chugging along with a brand new identity
with no possibility of recovering the private key.

ZeroTier should exit upon loading of invalid identity.public/identity.secret #2056

* add validation test for #2056

* tcp-proxy: fix build

* Adjust tcp-proxy makefile to support metrics

There's no way to get the metrics yet. Someone will
have to add the http service.

* remove ZT_NO_METRIC ifdef

* Implement recvmmsg() for Linux to reduce syscalls. (#2046)

Between 5% and 40% speed improvement on Linux, depending on system configuration and load.

* suppress warnings: comparison of integers of different signs: 'int64_t' (aka 'long') and 'uint64_t' (aka 'unsigned long') [-Wsign-compare] (#2063)

* fix warning: 'OS_STRING' macro redefined [-Wmacro-redefined] (#2064)

Even though this is in ext, these particular chunks of code were added
by us, so are ok to modify.

* Apply default route a different way - macOS

The original way we applied default route, by forking
0.0.0.0/0 into 0/1 and 128/1 works, but if mac os has any networking
hiccups -if you change SSIDs or sleep/wake- macos erases the system default route.
And then all networking on the computer is broken.

to summarize the new way:
allowDefault=1
```
sudo route delete default 192.168.82.1
sudo route add default 10.2.0.2
sudo route add -ifscope en1 default 192.168.82.1
```

gives us this routing table
```
Destination        Gateway            RT_IFA             Flags        Refs      Use    Mtu          Netif Expire    rtt(ms) rttvar(ms)
default            10.2.0.2           10.2.0.18          UGScg          90        1   2800       feth4823
default            192.168.82.1       192.168.82.217     UGScIg
```

allowDefault=0
```
sudo route delete default
sudo route delete -ifscope en1 default
sudo route add default 192.168.82.1
```

Notice the I flag, for -ifscope, on the physical default route.

route change does not seem to work reliably.

* fix docker tag for controllers (#2066)

* Update build.sh (#2068)

fix mkwork compilation errors

* Fix network DNS on macOS

It stopped working for ipv4 only networks in Monterey.
See #1696

We add some config like so to System Configuration

```
scutil
show State:/Network/Service/9bee8941b5xxxxxx/IPv4
<dictionary> {
  Addresses : <array> {
    0 : 10.2.1.36
  }
  InterfaceName : feth4823
  Router : 10.2.1.36
  ServerAddress : 127.0.0.1
}

```

* Add search domain to macos dns configuration

Stumbled upon this while debugging something else.
If we add search domain to our system configuration for
network DNS, then search domains work:

```
ping server1                                                                                                                                                                                    ~
PING server1.my.domain (10.123.3.1): 56 data bytes
64 bytes from 10.123.3.1
```

* Fix reporting of secondaryPort and tertiaryPort See: #2039

* Fix typos (#2075)

* Disable executable stacks on assembly objects (#2071)

Add `--noexecstack` to the assembler flags so the resulting binary
will link with a non-executable stack.

Fixes #1179

Co-authored-by: Joseph Henry <[email protected]>

* Test that starting zerotier before internet works

* Don't skip hellos when there are no paths available

working on #2082

* Update validate-1m-linux.sh

* Save zt node log files on abort

* Separate test and summary step in validator script

* Don't apply default route until zerotier is "online"

I was running into issues with restarting the zerotier service while
"full tunnel" mode is enabled.
When zerotier first boots, it gets network state from the cache
on disk. So it immediately applies all the routes it knew about
before it shutdown.
The network config may have change in this time.
If it has, then your default route is via a route
you are blocked from talking on. So you  can't get the current
network config, so your internet does not work.

Other options include
- don't use cached network state on boot
- find a better criteria than "online"

* Fix node time-to-online counter in validator script

* Export variables so that they are accessible by exit function

* Fix PortMapper issue on ZeroTier startup

See issue #2082

We use a call to libnatpmp::ininatpp to make sure the computer
has working network sockets before we go into the main
nat-pmp/upnp logic.

With basic exponenetial delay up to 30 seconds.

* testing

* Comment out PortMapper debug

this got left turned on in a confusing merge previously

* fix macos default route again

see commit fb6af19 * Fix network DNS on macOS
adding that stuff to System Config causes this extra route to be added
which breaks ipv4 default route.
We figured out a weird System Coniguration setting
that works.

--- old
couldn't figure out how to fix it in SystemConfiguration
so here we are# Please enter the commit message for your changes. Lines starting

We also moved the dns setter to before the syncIps stuff
to help with a race condition. It didn't always work when
you re-joined a network with default route enabled.

* Catch all conditions in switch statement, remove trailing whitespaces

* Add setmtu command, fix bond lifetime issue

* Basic cleanups

* Check if null is passed to VirtualNetworkConfig.equals and name fixes

* ANDROID-96: Simplify and use return code from node_init directly

* Windows arm64 (#2099)

* ARM64 changes for 1.12

* 1.12 Windows advanced installer updates and updates for ARM64

* 1.12.0

* Linux build fixes for old distros.

* release notes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: travis laduke <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Leonardo Amaral <[email protected]>
Co-authored-by: Brenton Bostick <[email protected]>
Co-authored-by: Sean OMeara <[email protected]>
Co-authored-by: Joseph Henry <[email protected]>
Co-authored-by: Roman Peshkichev <[email protected]>
Co-authored-by: Joseph Henry <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Stavros Kois <[email protected]>
Co-authored-by: Jake Vis <[email protected]>
Co-authored-by: Jörg Thalheim <[email protected]>
Co-authored-by: lison <[email protected]>
Co-authored-by: Kenny MacDermid <[email protected]>
adamierymenko added a commit that referenced this issue Aug 25, 2023
* 1.10.6 merge to main (#1930)

* add note about forceTcpRelay

* Create a sample systemd unit for tcp proxy

* set gitattributes for rust & cargo so hashes dont conflict on Windows

* Revert "set gitattributes for rust & cargo so hashes dont conflict on Windows"

This reverts commit 032dc5c.

* Turn off autocrlf for rust source

Doesn't appear to play nice well when it comes to git and vendored cargo package hashes

* Fix #1883 (#1886)

Still unknown as to why, but the call to `nc->GetProperties()` can fail
when setting a friendly name on the Windows virtual ethernet adapter.
Ensure that `ncp` is not null before continuing and accessing the device
GUID.

* Don't vendor packages for zeroidc (#1885)

* Added docker environment way to join networks (#1871)

* add StringUtils

* fix headers
use recommended headers and remove unused headers

* move extern "C"
only JNI functions need to be exported

* cleanup

* fix ANDROID-50: RESULT_ERROR_BAD_PARAMETER typo

* fix typo in log message

* fix typos in JNI method signatures

* fix typo

* fix ANDROID-51: fieldName is uninitialized

* fix ANDROID-35: memory leak

* fix missing DeleteLocalRef in loops

* update to use unique error codes

* add GETENV macro

* add LOG_TAG defines

* ANDROID-48: add ZT_jnicache.cpp

* ANDROID-48: use ZT_jnicache.cpp and remove ZT_jnilookup.cpp and ZT_jniarray.cpp

* add Event.fromInt

* add PeerRole.fromInt

* add ResultCode.fromInt

* fix ANDROID-36: issues with ResultCode

* add VirtualNetworkConfigOperation.fromInt

* fix ANDROID-40: VirtualNetworkConfigOperation out-of-sync with ZT_VirtualNetworkConfigOperation enum

* add VirtualNetworkStatus.fromInt

* fix ANDROID-37: VirtualNetworkStatus out-of-sync with ZT_VirtualNetworkStatus enum

* add VirtualNetworkType.fromInt

* make NodeStatus a plain data class

* fix ANDROID-52: synchronization bug with nodeMap

* Node init work: separate Node construction and init

* add Node.toString

* make PeerPhysicalPath a plain data class

* remove unused PeerPhysicalPath.fixed

* add array functions

* make Peer a plain data class

* make Version a plain data class

* fix ANDROID-42: copy/paste error

* fix ANDROID-49: VirtualNetworkConfig.equals is wrong

* reimplement VirtualNetworkConfig.equals

* reimplement VirtualNetworkConfig.compareTo

* add VirtualNetworkConfig.hashCode

* make VirtualNetworkConfig a plain data class

* remove unused VirtualNetworkConfig.enabled

* reimplement VirtualNetworkDNS.equals

* add VirtualNetworkDNS.hashCode

* make VirtualNetworkDNS a plain data class

* reimplement VirtualNetworkRoute.equals

* reimplement VirtualNetworkRoute.compareTo

* reimplement VirtualNetworkRoute.toString

* add VirtualNetworkRoute.hashCode

* make VirtualNetworkRoute a plain data class

* add isSocketAddressEmpty

* add addressPort

* add fromSocketAddressObject

* invert logic in a couple of places and return early

* newInetAddress and newInetSocketAddress work
allow newInetSocketAddress to return NULL if given empty address

* fix ANDROID-38: stack corruption in onSendPacketRequested

* use GETENV macro

* JniRef work
JniRef does not use callbacks struct, so remove
fix NewGlobalRef / DeleteGlobalRef mismatch

* use PRId64 macros

* switch statement work

* comments and logging

* Modifier 'public' is redundant for interface members

* NodeException can be made a checked Exception

* 'NodeException' does not define a 'serialVersionUID' field

* 'finalize()' should not be overridden
this is fine to do because ZeroTierOneService calls close() when it is done

* error handling, error reporting, asserts, logging

* simplify loadLibrary

* rename Node.networks -> Node.networkConfigs

* Windows file permissions fix (#1887)

* Allow macOS interfaces to use multiple IP addresses (#1879)

Co-authored-by: Sean OMeara <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>

* Fix condition where full HELLOs might not be sent when necessary (#1877)

Co-authored-by: Grant Limberg <[email protected]>

* 1.10.4 version bumps

* Add security policy to repo (#1889)

* [+] add e2k64 arch (#1890)

* temp fix for ANDROID-56: crash inside newNetworkConfig from too many args

* 1.10.4 release notes

* Windows 1.10.4 Advanced Installer bump

* Revert "temp fix for ANDROID-56: crash inside newNetworkConfig from too many args"

This reverts commit dd627cd.

* actual fix for ANDROID-56: crash inside newNetworkConfig
cast all arguments to varargs functions as good style

* Fix addIp being called with applied ips (#1897)

This was getting called outside of the check for existing ips
Because of the added ifdef and a brace getting moved to the
wrong place.

```
if (! n.tap()->addIp(*ip)) {
	fprintf(stderr, "ERROR: unable to add ip address %s" ZT_EOL_S, ip->toString(ipbuf));
}
WinFWHelper::newICMPRule(*ip, n.config().nwid);

```

* 1.10.5 (#1905)

* 1.10.5 bump

* 1.10.5 for Windows

* 1.10.5

* Prevent path-learning loops (#1914)

* Prevent path-learning loops

* Only allow new overwrite if not bonded

* fix binding temporary ipv6 addresses on macos (#1910)

The check code wasn't running.

I don't know why !defined(TARGET_OS_IOS) would exclude code on
desktop macOS. I did a quick search and changed it to defined(TARGET_OS_MAC).
Not 100% sure what the most correct solution there is.

You can verify the old and new versions with

`ifconfig | grep temporary`

plus

`zerotier-cli info -j` -> listeningOn

* 1.10.6 (#1929)

* 1.10.5 bump

* 1.10.6

* 1.10.6 AIP for Windows.

---------

Co-authored-by: travis laduke <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Leonardo Amaral <[email protected]>
Co-authored-by: Brenton Bostick <[email protected]>
Co-authored-by: Sean OMeara <[email protected]>
Co-authored-by: Joseph Henry <[email protected]>
Co-authored-by: Roman Peshkichev <[email protected]>

* 1.12.0 merge to main (#2104)

* add note about forceTcpRelay

* Create a sample systemd unit for tcp proxy

* set gitattributes for rust & cargo so hashes dont conflict on Windows

* Revert "set gitattributes for rust & cargo so hashes dont conflict on Windows"

This reverts commit 032dc5c.

* Turn off autocrlf for rust source

Doesn't appear to play nice well when it comes to git and vendored cargo package hashes

* Fix #1883 (#1886)

Still unknown as to why, but the call to `nc->GetProperties()` can fail
when setting a friendly name on the Windows virtual ethernet adapter.
Ensure that `ncp` is not null before continuing and accessing the device
GUID.

* Don't vendor packages for zeroidc (#1885)

* Added docker environment way to join networks (#1871)

* add StringUtils

* fix headers
use recommended headers and remove unused headers

* move extern "C"
only JNI functions need to be exported

* cleanup

* fix ANDROID-50: RESULT_ERROR_BAD_PARAMETER typo

* fix typo in log message

* fix typos in JNI method signatures

* fix typo

* fix ANDROID-51: fieldName is uninitialized

* fix ANDROID-35: memory leak

* fix missing DeleteLocalRef in loops

* update to use unique error codes

* add GETENV macro

* add LOG_TAG defines

* ANDROID-48: add ZT_jnicache.cpp

* ANDROID-48: use ZT_jnicache.cpp and remove ZT_jnilookup.cpp and ZT_jniarray.cpp

* add Event.fromInt

* add PeerRole.fromInt

* add ResultCode.fromInt

* fix ANDROID-36: issues with ResultCode

* add VirtualNetworkConfigOperation.fromInt

* fix ANDROID-40: VirtualNetworkConfigOperation out-of-sync with ZT_VirtualNetworkConfigOperation enum

* add VirtualNetworkStatus.fromInt

* fix ANDROID-37: VirtualNetworkStatus out-of-sync with ZT_VirtualNetworkStatus enum

* add VirtualNetworkType.fromInt

* make NodeStatus a plain data class

* fix ANDROID-52: synchronization bug with nodeMap

* Node init work: separate Node construction and init

* add Node.toString

* make PeerPhysicalPath a plain data class

* remove unused PeerPhysicalPath.fixed

* add array functions

* make Peer a plain data class

* make Version a plain data class

* fix ANDROID-42: copy/paste error

* fix ANDROID-49: VirtualNetworkConfig.equals is wrong

* reimplement VirtualNetworkConfig.equals

* reimplement VirtualNetworkConfig.compareTo

* add VirtualNetworkConfig.hashCode

* make VirtualNetworkConfig a plain data class

* remove unused VirtualNetworkConfig.enabled

* reimplement VirtualNetworkDNS.equals

* add VirtualNetworkDNS.hashCode

* make VirtualNetworkDNS a plain data class

* reimplement VirtualNetworkRoute.equals

* reimplement VirtualNetworkRoute.compareTo

* reimplement VirtualNetworkRoute.toString

* add VirtualNetworkRoute.hashCode

* make VirtualNetworkRoute a plain data class

* add isSocketAddressEmpty

* add addressPort

* add fromSocketAddressObject

* invert logic in a couple of places and return early

* newInetAddress and newInetSocketAddress work
allow newInetSocketAddress to return NULL if given empty address

* fix ANDROID-38: stack corruption in onSendPacketRequested

* use GETENV macro

* JniRef work
JniRef does not use callbacks struct, so remove
fix NewGlobalRef / DeleteGlobalRef mismatch

* use PRId64 macros

* switch statement work

* comments and logging

* Modifier 'public' is redundant for interface members

* NodeException can be made a checked Exception

* 'NodeException' does not define a 'serialVersionUID' field

* 'finalize()' should not be overridden
this is fine to do because ZeroTierOneService calls close() when it is done

* error handling, error reporting, asserts, logging

* simplify loadLibrary

* rename Node.networks -> Node.networkConfigs

* Windows file permissions fix (#1887)

* Allow macOS interfaces to use multiple IP addresses (#1879)

Co-authored-by: Sean OMeara <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>

* Fix condition where full HELLOs might not be sent when necessary (#1877)

Co-authored-by: Grant Limberg <[email protected]>

* 1.10.4 version bumps

* Add security policy to repo (#1889)

* [+] add e2k64 arch (#1890)

* temp fix for ANDROID-56: crash inside newNetworkConfig from too many args

* 1.10.4 release notes

* Windows 1.10.4 Advanced Installer bump

* Revert "temp fix for ANDROID-56: crash inside newNetworkConfig from too many args"

This reverts commit dd627cd.

* actual fix for ANDROID-56: crash inside newNetworkConfig
cast all arguments to varargs functions as good style

* Fix addIp being called with applied ips (#1897)

This was getting called outside of the check for existing ips
Because of the added ifdef and a brace getting moved to the
wrong place.

```
if (! n.tap()->addIp(*ip)) {
	fprintf(stderr, "ERROR: unable to add ip address %s" ZT_EOL_S, ip->toString(ipbuf));
}
WinFWHelper::newICMPRule(*ip, n.config().nwid);

```

* 1.10.5 (#1905)

* 1.10.5 bump

* 1.10.5 for Windows

* 1.10.5

* Prevent path-learning loops (#1914)

* Prevent path-learning loops

* Only allow new overwrite if not bonded

* fix binding temporary ipv6 addresses on macos (#1910)

The check code wasn't running.

I don't know why !defined(TARGET_OS_IOS) would exclude code on
desktop macOS. I did a quick search and changed it to defined(TARGET_OS_MAC).
Not 100% sure what the most correct solution there is.

You can verify the old and new versions with

`ifconfig | grep temporary`

plus

`zerotier-cli info -j` -> listeningOn

* 1.10.6 (#1929)

* 1.10.5 bump

* 1.10.6

* 1.10.6 AIP for Windows.

* Release notes for 1.10.6 (#1931)

* Minor tweak to Synology Docker image script (#1936)

* Change if_def again so ios can build (#1937)

All apple's variables are "defined"
but sometimes they are defined as "0"

* move begin/commit into try/catch block (#1932)

Thread was exiting in some cases

* Bump openssl from 0.10.45 to 0.10.48 in /zeroidc (#1938)

Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.45 to 0.10.48.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](sfackler/rust-openssl@openssl-v0.10.45...openssl-v0.10.48)

---
updated-dependencies:
- dependency-name: openssl
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* new drone bits

* Fix multiple network join from environment entrypoint.sh.release (#1961)

* _bond_m guards _bond, not _paths_m (#1965)

* Fix: warning: mutex '_aqm_m' is not held on every path through here [-Wthread-safety-analysis] (#1964)

* Bump h2 from 0.3.16 to 0.3.17 in /zeroidc (#1963)

Bumps [h2](https://github.com/hyperium/h2) from 0.3.16 to 0.3.17.
- [Release notes](https://github.com/hyperium/h2/releases)
- [Changelog](https://github.com/hyperium/h2/blob/master/CHANGELOG.md)
- [Commits](hyperium/h2@v0.3.16...v0.3.17)

---
updated-dependencies:
- dependency-name: h2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Grant Limberg <[email protected]>

* Add note that binutils is required on FreeBSD (#1968)

* Add prometheus metrics for Central controllers (#1969)

* add header-only prometheus lib to ext

* rename folder

* Undo rename directory

* prometheus simpleapi included on mac & linux

* wip

* wire up some controller stats

* Get windows building with prometheus

* bsd build flags for prometheus

* Fix multiple network join from environment entrypoint.sh.release (#1961)

* _bond_m guards _bond, not _paths_m (#1965)

* Fix: warning: mutex '_aqm_m' is not held on every path through here [-Wthread-safety-analysis] (#1964)

* Serve prom metrics from /metrics endpoint

* Add prom metrics for Central controller specific things

* reorganize metric initialization

* testing out a labled gauge on Networks

* increment error counter on throw

* Consolidate metrics definitions

Put all metric definitions into node/Metrics.hpp.  Accessed as needed
from there.

* Revert "testing out a labled gauge on Networks"

This reverts commit 499ed6d.

* still blows up but adding to the record for completeness right now

* Fix runtime issues with metrics

* Add metrics files to visual studio project

* Missed an "extern"

* add copyright headers to new files

* Add metrics for sent/received bytes (total)

* put /metrics endpoint behind auth

* sendto returns int on Win32

---------

Co-authored-by: Leonardo Amaral <[email protected]>
Co-authored-by: Brenton Bostick <[email protected]>

* Central startup update (#1973)

* allow specifying authtoken in central startup

* set allowManagedFrom

* move redis_mem_notification to the correct place

* add node checkins metric

* wire up min/max connection pool size metrics

* x86_64-unknown-linux-gnu on ubuntu runner (#1975)

* adding incoming zt packet type metrics (#1976)

* use cpp-httplib for HTTP control plane (#1979)

refactored the old control plane code to use [cpp-httplib](https://github.com/yhirose/cpp-httplib) instead of a hand rolled HTTP server.  Makes the control plane code much more legible.  Also no longer randomly stops responding.

* Outgoing Packet Metrics (#1980)

add tx/rx labels to packet counters and add metrics for outgoing packets

* Add short-term validation test workflow (#1974)

Add short-term validation test workflow

* Brenton/curly braces (#1971)

* fix formatting

* properly adjust various lines
breakup multiple statements onto multiple lines

* insert {} around if, for, etc.

* Fix rust dependency caching (#1983)

* fun with rust caching

* kick

* comment out invalid yaml keys for now

* Caching should now work

* re-add/rename key directives

* bump

* bump

* bump

* Don't force rebuild on Windows build GH Action (#1985)

Switching `/t:ZeroTierOne:Rebuild` to just `/t:ZeroTierOne` allows the Windows build to use the rust cache.  `/t:ZeroTierOne:Rebuild` cleared the cache before building.

* More packet metrics (#1982)

* found path negotation sends that weren't accounted for

* Fix histogram so it will actually compile

* Found more places for packet metrics

* separate the bind & listen calls on the http backplane (#1988)

* fix memory leak (#1992)

* fix a couple of metrics (#1989)

* More aggressive CLI spamming (#1993)

* fix type signatures (#1991)

* Network-metrics (#1994)

* Add a couple quick functions for converting a uint64_t network ID/node ID into std::string

* Network metrics

* Peer metrics (#1995)

* Adding peer metrics

still need to be wired up for use

* per peer packet metrics

* Fix crash from bad instantiation of histogram

* separate alive & dead path counts

* Add peer metric update block

* add peer latency values in doPingAndKeepalive

* prevent deadlock

* peer latency histogram actually works now

* cleanup

* capture counts of packets to specific peers

---------

Co-authored-by: Joseph Henry <[email protected]>

* Metrics consolidation (#1997)

* Rename zt_packet_incoming -> zt_packet

Also consolidate zt_peer_packets into a single metric with tx and rx labels.  Same for ztc_tcp_data and ztc_udp_data

* Further collapse tcp & udp into metric labels for zt_data

* Fix zt_data metric description

* zt_peer_packets description fix

* Consolidate incoming/outgoing network packets to a single metric

* zt_incoming_packet_error -> zt_packet_error

* Disable peer metrics for central controllers

Can change in the future if needed, but given the traffic our controllers serve, that's going to be a *lot* of data

* Disable peer metrics for controllers pt 2

* Update readme files for metrics (#2000)

* Controller Metrics & Network Config Request Fix (#2003)

* add new metrics for network config request queue size and sso expirations
* move sso expiration to its own thread in the controller
* fix potential undefined behavior when modifying a set

* Enable RTTI in Windows build

The new prometheus histogram stuff needs it.

Access violation - no RTTI data!INVALID packet 636ebd9ee8cac6c0 from cafe9efeb9(2605:9880:200:1200:30:571:e34:51/9993) (unexpected exception in tryDecode())

* Don't re-apply routes on BSD

See issue #1986

* Capture setContent by-value instead of by-reference (#2006)

Co-authored-by: Grant Limberg <[email protected]>

* fix typos (#2010)

* central controller metrics & request path updates (#2012)

* internal db metrics

* use shared mutexes for read/write locks

* remove this lock. only used for a metric

* more metrics

* remove exploratory metrics

place controller request benchmarks behind ifdef

* Improve validation test (#2013)

* fix init order for EmbeddedNetworkController (#2014)

* add constant for getifaddrs cache time

* cache getifaddrs - mac

* cache getifaddrs - linux

* cache getifaddrs - bsd

* cache getifaddrs - windows

* Fix oidc client lookup query

join condition referenced the wrong table.  Worked fine unless there were multiple identical client IDs

* Fix udp sent metric

was only incrementing by 1 for each packet sent

* Allow sending all surface addresses to peer in low-bandwidth mode

* allow enabling of low bandwidth mode on controllers

* don't unborrow bad connections

pool will clean them up later

* Multi-arch controller container (#2037)

create arm64 & amd64 images for central controller

* Update README.md

issue #2009

* docker tags change

* fix oidc auth url memory leak (#2031)

getAuthURL() was not calling zeroidc::free_cstr(url);

the only place authAuthURL is called, the url can be retrieved
from the network config instead.

You could alternatively copy the string and call free_cstr in getAuthURL.
If that's better we can change the PR.

Since now there are no callers of getAuthURL I deleted it.

Co-authored-by: Grant Limberg <[email protected]>

* Bump openssl from 0.10.48 to 0.10.55 in /zeroidc (#2034)

Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.48 to 0.10.55.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](sfackler/rust-openssl@openssl-v0.10.48...openssl-v0.10.55)

---
updated-dependencies:
- dependency-name: openssl
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Grant Limberg <[email protected]>

* zeroidc cargo warnings (#2029)

* fix unused struct member cargo warning

* fix unused import cargo warning

* fix unused return value cargo warning

---------

Co-authored-by: Grant Limberg <[email protected]>

* fix memory leak in macos ipv6/dns helper (#2030)

Co-authored-by: Grant Limberg <[email protected]>

* Consider ZEROTIER_JOIN_NETWORKS in healthcheck (#1978)

* Add a 2nd auth token only for access to /metrics (#2043)

* Add a 2nd auth token for /metrics

Allows administrators to distribute a token that only has access to read
metrics and nothing else.

Also added support for using bearer auth tokens for both types of tokens

Separate endpoint for metrics #2041

* Update readme

* fix a couple of cases of writing the wrong token

* Add warning to cli for allow default on FreeBSD

It doesn't work.
Not possible to fix with deficient network
stack and APIs.

ZeroTierOne-freebsd # zerotier-cli set 9bee8941b5xxxxxx allowDefault=1
400 set Allow Default does not work properly on FreeBSD. See #580
root@freebsd13-a:~/ZeroTierOne-freebsd # zerotier-cli get 9bee8941b5xxxxxx allowDefault
1

* ARM64 Support for TapDriver6 (#1949)

* Release memory previously allocated by UPNP_GetValidIGD

* Fix ifdef that breaks libzt on iOS (#2050)

* less drone (#2060)

* Exit if loading an invalid identity from disk (#2058)

* Exit if loading an invalid identity from disk

Previously, if an invalid identity was loaded from disk, ZeroTier would
generate a new identity & chug along and generate a brand new identity
as if nothing happened.  When running in containers, this introduces the
possibility for key matter loss; especially when running in containers
where the identity files are mounted in the container read only.  In
this case, ZT will continue chugging along with a brand new identity
with no possibility of recovering the private key.

ZeroTier should exit upon loading of invalid identity.public/identity.secret #2056

* add validation test for #2056

* tcp-proxy: fix build

* Adjust tcp-proxy makefile to support metrics

There's no way to get the metrics yet. Someone will
have to add the http service.

* remove ZT_NO_METRIC ifdef

* Implement recvmmsg() for Linux to reduce syscalls. (#2046)

Between 5% and 40% speed improvement on Linux, depending on system configuration and load.

* suppress warnings: comparison of integers of different signs: 'int64_t' (aka 'long') and 'uint64_t' (aka 'unsigned long') [-Wsign-compare] (#2063)

* fix warning: 'OS_STRING' macro redefined [-Wmacro-redefined] (#2064)

Even though this is in ext, these particular chunks of code were added
by us, so are ok to modify.

* Apply default route a different way - macOS

The original way we applied default route, by forking
0.0.0.0/0 into 0/1 and 128/1 works, but if mac os has any networking
hiccups -if you change SSIDs or sleep/wake- macos erases the system default route.
And then all networking on the computer is broken.

to summarize the new way:
allowDefault=1
```
sudo route delete default 192.168.82.1
sudo route add default 10.2.0.2
sudo route add -ifscope en1 default 192.168.82.1
```

gives us this routing table
```
Destination        Gateway            RT_IFA             Flags        Refs      Use    Mtu          Netif Expire    rtt(ms) rttvar(ms)
default            10.2.0.2           10.2.0.18          UGScg          90        1   2800       feth4823
default            192.168.82.1       192.168.82.217     UGScIg
```

allowDefault=0
```
sudo route delete default
sudo route delete -ifscope en1 default
sudo route add default 192.168.82.1
```

Notice the I flag, for -ifscope, on the physical default route.

route change does not seem to work reliably.

* fix docker tag for controllers (#2066)

* Update build.sh (#2068)

fix mkwork compilation errors

* Fix network DNS on macOS

It stopped working for ipv4 only networks in Monterey.
See #1696

We add some config like so to System Configuration

```
scutil
show State:/Network/Service/9bee8941b5xxxxxx/IPv4
<dictionary> {
  Addresses : <array> {
    0 : 10.2.1.36
  }
  InterfaceName : feth4823
  Router : 10.2.1.36
  ServerAddress : 127.0.0.1
}

```

* Add search domain to macos dns configuration

Stumbled upon this while debugging something else.
If we add search domain to our system configuration for
network DNS, then search domains work:

```
ping server1                                                                                                                                                                                    ~
PING server1.my.domain (10.123.3.1): 56 data bytes
64 bytes from 10.123.3.1
```

* Fix reporting of secondaryPort and tertiaryPort See: #2039

* Fix typos (#2075)

* Disable executable stacks on assembly objects (#2071)

Add `--noexecstack` to the assembler flags so the resulting binary
will link with a non-executable stack.

Fixes #1179

Co-authored-by: Joseph Henry <[email protected]>

* Test that starting zerotier before internet works

* Don't skip hellos when there are no paths available

working on #2082

* Update validate-1m-linux.sh

* Save zt node log files on abort

* Separate test and summary step in validator script

* Don't apply default route until zerotier is "online"

I was running into issues with restarting the zerotier service while
"full tunnel" mode is enabled.
When zerotier first boots, it gets network state from the cache
on disk. So it immediately applies all the routes it knew about
before it shutdown.
The network config may have change in this time.
If it has, then your default route is via a route
you are blocked from talking on. So you  can't get the current
network config, so your internet does not work.

Other options include
- don't use cached network state on boot
- find a better criteria than "online"

* Fix node time-to-online counter in validator script

* Export variables so that they are accessible by exit function

* Fix PortMapper issue on ZeroTier startup

See issue #2082

We use a call to libnatpmp::ininatpp to make sure the computer
has working network sockets before we go into the main
nat-pmp/upnp logic.

With basic exponenetial delay up to 30 seconds.

* testing

* Comment out PortMapper debug

this got left turned on in a confusing merge previously

* fix macos default route again

see commit fb6af19 * Fix network DNS on macOS
adding that stuff to System Config causes this extra route to be added
which breaks ipv4 default route.
We figured out a weird System Coniguration setting
that works.

--- old
couldn't figure out how to fix it in SystemConfiguration
so here we are# Please enter the commit message for your changes. Lines starting

We also moved the dns setter to before the syncIps stuff
to help with a race condition. It didn't always work when
you re-joined a network with default route enabled.

* Catch all conditions in switch statement, remove trailing whitespaces

* Add setmtu command, fix bond lifetime issue

* Basic cleanups

* Check if null is passed to VirtualNetworkConfig.equals and name fixes

* ANDROID-96: Simplify and use return code from node_init directly

* Windows arm64 (#2099)

* ARM64 changes for 1.12

* 1.12 Windows advanced installer updates and updates for ARM64

* 1.12.0

* Linux build fixes for old distros.

* release notes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: travis laduke <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Leonardo Amaral <[email protected]>
Co-authored-by: Brenton Bostick <[email protected]>
Co-authored-by: Sean OMeara <[email protected]>
Co-authored-by: Joseph Henry <[email protected]>
Co-authored-by: Roman Peshkichev <[email protected]>
Co-authored-by: Joseph Henry <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Stavros Kois <[email protected]>
Co-authored-by: Jake Vis <[email protected]>
Co-authored-by: Jörg Thalheim <[email protected]>
Co-authored-by: lison <[email protected]>
Co-authored-by: Kenny MacDermid <[email protected]>

* Fix primary port binding issue in 1.12 (#2107)

* Add test for primary port bindings to validator - See #2105

* Add delay to binding test

* Remove TCP binding logic from Binder to fix #2105

* add second control plane socket for ipv6

* fix controller network post endpoint

* exit if we can't bind at least one of IPV4 or IPV6 for control plane port

---------

Co-authored-by: Grant Limberg <[email protected]>

* Version bump, Linux version stuff, Debian dependencies from 1.12.0 rebuild, release notes.

* macOS version bump in installer

* Windows version bump.

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: travis laduke <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Grant Limberg <[email protected]>
Co-authored-by: Leonardo Amaral <[email protected]>
Co-authored-by: Brenton Bostick <[email protected]>
Co-authored-by: Sean OMeara <[email protected]>
Co-authored-by: Joseph Henry <[email protected]>
Co-authored-by: Roman Peshkichev <[email protected]>
Co-authored-by: Joseph Henry <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Stavros Kois <[email protected]>
Co-authored-by: Jake Vis <[email protected]>
Co-authored-by: Jörg Thalheim <[email protected]>
Co-authored-by: lison <[email protected]>
Co-authored-by: Kenny MacDermid <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug Bug to be resolved
Projects
None yet
Development

No branches or pull requests

5 participants