I've been using a redundant OpenBSD/PF configuration (pfsync, CARP) in a high-traffic telecommunication system. It has performed rather well, but the way PF handles state expiration has lately become an issue.

PF expires states in periodic sweeps over the state structure. This causes jitter in traffic and, with our typical loads (>5000 new states/s), also congestion, especially on the synchronization network interface. As a result, the backup gateway loses some state updates and accumulates long-lived TCP ESTABLISHED:ESTABLISHED states that have already been purged on the active PF machine.

I propose the following solution: PF state entries should be modified to include one more RB-tree entry, used for state expiration. This tree would be keyed primarily on the return value of pf_state_expires() (modified to return 0 instead of time_second for instant expiry) and secondarily on the state id, to prevent duplicate keys. Every place in the PF code that modifies the timeout or expire values should be changed to call an update function for this new tree.

The comparison function for the new tree would be:

    static __inline int
    pf_state_compare_expire(struct pf_state *a, struct pf_state *b)
    {
        /* Primary key: time at which the state expires. */
        if (a->expire_key > b->expire_key)
            return (1);
        if (a->expire_key < b->expire_key)
            return (-1);
        /* Secondary key: state id, so that keys are never equal. */
        return pf_state_compare_id(a, b);
    }

And the expire/timeout update function:

    void
    pf_update_state_expire(struct pf_state *cur, u_int32_t expire,
        u_int8_t timeout)
    {
        u_int32_t new_expire_key;

        /* Store the new values first so that the key reflects them. */
        cur->timeout = timeout;
        cur->expire = expire;
        new_expire_key = pf_state_expires(cur);

        /* Re-insert only if the position in the tree can change. */
        if (new_expire_key != cur->expire_key) {
            RB_REMOVE(pf_state_tree_expires, &tree_expires, cur);
            cur->expire_key = new_expire_key;
            RB_INSERT(pf_state_tree_expires, &tree_expires, cur);
        }
    }

After this modification, the state purging function can be changed so that it takes RB_MIN from the expiration tree and expires only that state, and only if it is old enough. Since RB_MIN is always the next state to expire, no sweep is needed.
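The ordering and the RB_MIN-based purge described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not PF code: struct state, expire_insert(), expire_remove(), expire_update() and purge_one() are hypothetical names, and a sorted linked list stands in for the RB tree so that the example compiles without <sys/tree.h>. In PF the same comparison would drive RB_INSERT, RB_REMOVE and RB_MIN, with O(log n) updates instead of the O(n) list walk used here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct pf_state. */
struct state {
    uint64_t id;           /* unique state id, used as tie-breaker */
    uint32_t expire_key;   /* absolute expiry time in seconds, 0 = now */
    struct state *next;    /* next state in expiry order */
};

/* Head of the sorted list; plays the role of RB_MIN on the tree. */
static struct state *expire_head;

/* Same ordering as the proposed comparison: expiry time, then id. */
static int
state_compare_expire(const struct state *a, const struct state *b)
{
    if (a->expire_key > b->expire_key)
        return (1);
    if (a->expire_key < b->expire_key)
        return (-1);
    if (a->id > b->id)
        return (1);
    if (a->id < b->id)
        return (-1);
    return (0);
}

/* Insert a state at its sorted position (stand-in for RB_INSERT). */
static void
expire_insert(struct state *s)
{
    struct state **pp = &expire_head;

    while (*pp != NULL && state_compare_expire(*pp, s) < 0)
        pp = &(*pp)->next;
    s->next = *pp;
    *pp = s;
}

/* Unlink a state if it is on the list (stand-in for RB_REMOVE). */
static void
expire_remove(struct state *s)
{
    struct state **pp = &expire_head;

    while (*pp != NULL && *pp != s)
        pp = &(*pp)->next;
    if (*pp == s) {
        *pp = s->next;
        s->next = NULL;
    }
}

/* Re-key a state; it is moved only when the key actually changes. */
static void
expire_update(struct state *s, uint32_t new_key)
{
    if (new_key == s->expire_key)
        return;
    expire_remove(s);
    s->expire_key = new_key;
    expire_insert(s);
}

/*
 * Purge at most one state: look only at the minimum and detach it
 * if its expiry time has been reached. Returns the detached state,
 * or NULL when nothing is due yet.
 */
static struct state *
purge_one(uint32_t now)
{
    struct state *s = expire_head;

    if (s == NULL || s->expire_key > now)
        return (NULL);
    expire_head = s->next;
    s->next = NULL;
    return (s);
}
```

Because purge_one() inspects only the minimum element, a call that finds nothing to expire costs a single comparison, which is what makes it plausible to call such a check from the packet path.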
Each purge call should expire only one state, to prevent long lock-ups. However, the purge check is now so cheap that it can also be called from both pf_test and pf_insert_state, so the load of purging would be spread evenly.

Of course, not all operations currently done in the purging function can be done this way, because pool_* functions shouldn't be called without a process context (as far as I understand the issue). The purge can instead leave a list of states to be freed, and the actual freeing of memory can then be done quickly in the periodic cleanup.

Arguably this method causes some additional CPU load. However, at least in my measurements, I have found it next to impossible to get much more than 50% CPU utilization on a firewall running OpenBSD 3.8 without significant packet or state loss caused by the state expiration thread.

Comments or better solution ideas?

--
Teemu J. Takanen
Product Security Manager, Tecnomen Messaging Services