[tor-commits] [stem/master] Don't clone immutables when parsing descriptors
atagar at torproject.org
atagar at torproject.org
Mon Jan 30 18:31:26 UTC 2017
commit 1a50ce30e95dd11dbc80167f4c4df2df0b05f55b
Author: Damian Johnson <atagar at torproject.org>
Date: Tue Jan 24 10:44:01 2017 -0800
Don't clone immutables when parsing descriptors
Avoiding an unnecessary copy call on types which are immutable. Seems even for
empty lists and dictionaries calling the constructor rather than copy() is a
tad faster.
On my wee little netbook this speeds up reading microdescriptors by ~5% when
there's validation and speeds our integ tests a bit...
Before
descriptor.microdescriptor... success (15.38s)
descriptor.networkstatus... success (23.25s)
After
descriptor.microdescriptor... success (13.96s)
descriptor.networkstatus... success (22.77s)
This should help all descriptor reading though when validation is disabled this
is boon is only when accessing attributes.
---
docs/change_log.rst | 1 +
stem/descriptor/__init__.py | 12 +++++++++++-
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/docs/change_log.rst b/docs/change_log.rst
index e16a948..477db2e 100644
--- a/docs/change_log.rst
+++ b/docs/change_log.rst
@@ -45,6 +45,7 @@ The following are only available within Stem's `git repository
* **Descriptors**
+ * Sped descriptor reading by ~5% by not cloning immutable fields
* Support for protocol descriptor fields (:spec:`eb4fb3c`)
* Shared randomness properties weren't being read in votes (:trac:`21102`)
diff --git a/stem/descriptor/__init__.py b/stem/descriptor/__init__.py
index d7b1aea..67ea374 100644
--- a/stem/descriptor/__init__.py
+++ b/stem/descriptor/__init__.py
@@ -78,6 +78,7 @@ KEYWORD_LINE = re.compile('^([%s]+)(?:[%s]+(.*))?$' % (KEYWORD_CHAR, WHITESPACE)
SPECIFIC_KEYWORD_LINE = '^(%%s)(?:[%s]+(.*))?$' % WHITESPACE
PGP_BLOCK_START = re.compile('^-----BEGIN ([%s%s]+)-----$' % (KEYWORD_CHAR, WHITESPACE))
PGP_BLOCK_END = '-----END %s-----'
+EMPTY_COLLECTION = ([], {}, set())
DocumentHandler = stem.util.enum.UppercaseEnum(
'ENTRIES',
@@ -519,7 +520,16 @@ class Descriptor(object):
for attr in self.ATTRIBUTES:
if not hasattr(self, attr):
- setattr(self, attr, copy.copy(self.ATTRIBUTES[attr][0]))
+ value = self.ATTRIBUTES[attr][0]
+
+ if value is None or isinstance(value, (bool, stem.exit_policy.ExitPolicy)):
+ pass # immutable
+ elif value in EMPTY_COLLECTION:
+ value = type(value)() # collection construction tad faster than copy
+ else:
+ value = copy.copy(value)
+
+ setattr(self, attr, value)
for keyword, values in list(entries.items()):
try:
More information about the tor-commits
mailing list