[tor-dev] prop224: HSDir caches question with OOM

Thu Apr 21 09:55:26 UTC 2016

Tim Wilson-Brown - teor <teor2345 at gmail.com> writes:

> [ text/plain ]
>
>> On 20 Apr 2016, at 07:22, David Goulet <dgoulet at ev0ke.net> wrote:
>> 
>> On 18 Apr (13:18:25), George Kadianakis wrote:
>>> Tim Wilson-Brown - teor <teor2345 at gmail.com> writes:
>>> 
>>>> [ text/plain ]
>>>> 
>>>>> On 16 Apr 2016, at 05:47, David Goulet <dgoulet at ev0ke.net> wrote:
>>>>> 
>>>>> Hi!
>>>>> 
>>>>> (For the following, I'm only talking about HS directory.)
>>>>> 
>>>>> Here is my conundrum. I was working on plugging the new cache for proposal 224
>>>>> into our OOM. I stumbled upon a fun problem.
>>>>> 
>>>>> <snip>
>>>> 
>>>> We should add entries to a time-sorted list when they are created, otherwise this algorithm is O(n^2).
>>>> 
>>>> A third alternative is that we can iterate through each time period:
>>>> Set K to the oldest expected descriptor age in hours, minus 1 hour
>>>> Deallocate all entries from Cache A that are older than K hours
>>>> Deallocate all entries from Cache B that are older than K hours
>>>> Set K to K - 1 and repeat this process
>>>> 
>>>> This algorithm is O(Kn), which is ok as long as K is small.
>>>> This carries a slight risk of over-deallocating cache entries. Which is OK at OOM time.
>>>> I like this one, because it's simple, performant, and doesn't need any extra memory allocations.
>>>> 
>>> 
>>> I feel this is a reasonable approach here. It will probably have to be combined
>>> with time-sorted lists as teor suggested, otherwise each of those steps will
>>> require walking through the whole cache every time.
>
> Yes, that's why I said O(K*n), which is alright as long as K is small.
> If we keep sorted lists (warn: v2 code changes), then the algorithm is O(n) instead.
>
>>> 
>>> BTW, before it says "Set K to K - 1 and repeat this process" we should also
>>> check whether our memory issue has been addressed and if so then we can exit,
>>> right?
>> 
>> Yeah I would do that. Every time we cleanup a cache, we check if we've reached
>> our bytes cutoff.
>
> Yes, we can terminate the algorithm as soon as we reach our goal.
>
>>> In that case, maybe we can add the same check between these two checks:
>>>> Deallocate all entries from Cache A that are older than K hours
>>>> Deallocate all entries from Cache B that are older than K hours
>>> so that maybe sometimes we can get away with just removing stuff from the v2 cache.
>>> 
>>> Are there any other issues with the above approach?
>> 
>> While implementing this solution, I found one annoying issue :).
>> 
>> In order to compute K, I use the maximum lifetime of a cache entry. So
>> basically if my maximum lifetime in hours is M=10 then K will start at 10 - 1.
>> (Where 1 hour is basically the "assumed" HS post period.)
>
> Note that we can decrement K by a multiple of the post period - for example,
> it can be 3 hours: 9, 6, 3, 0.
>
> This makes the algorithm run faster (as it would iterate through the unsorted v3
> list 11 times, rather than 33 times), at the cost of sometimes deallocating some
> more recent descriptors in that 3 hour period.
>
>> However, our v2 and v3 cache have different lifetimes.
>> 
>>    v2: 48 hours is the max lifetime (REND_CACHE_MAX_AGE) plus the maximum
>>        clock skew of 24 hours (REND_CACHE_MAX_SKEW) == 72 hours.
>> 
>>    v3: We've settled on 33 hours which is 24h time period, 6 hours of overlap
>>        period and 3 hours of maximum clock skew.
>> 
>> So somehow, we need to "unify" K to fit both caches. Some options (some I do
>> not like):
>> 
>> 1) We could iterate over both caches at first and find out the oldest
>>   descriptor which requires a first pass of O(n) + O(m). Use that value for
>>   our starting point of K. But then inherently, we'll most of the time wipe v2
>>   descriptors since they have a long lifetime. (maybe good!?)
>
> I think that clients are just as likely to re-use descriptors that are an hour old,
> regardless of the maximum lifetimes of v2 and v3 descriptors.
>
> So I don't care if we take out more v2 descriptors.
>

I feel this is a fine way to do this.

Since hidden services reupload their descriptor every 1 hour (both in v2 and
v3), the oldest descriptors will always belong to dead hidden services, or to
hidden services that have since rotated HSDirs. So I think it's OK to wipe
super old v2 descriptors first.

>> 2) We could be crazier and set the v2 max lifetime to be the same as the v3...
>>   (FYI, we already have an open ticket about that to bring it down).
>
> I am ok with this, but we have no stats on it, so it's hard to tell what impact it
> will have (if any).
>
> Cutting the size of the v2 cache in half would also make space for the v3 cache
> during the transition period. So I support this idea.
>

This also seems like a plausible approach, especially since it seems easier to
implement than (1). 

Before doing this we should make sure that nothing will break if we bring the
72 hours v2 lifetime down to 33 hours. Personally, I can't think of anything
breaking right now.