reproman.utils¶
-
class
reproman.utils.
PathRoot
(predicate)[source]¶ Bases:
object
Find the root of paths based on a predicate function.
The path -> root mapping is cached across calls.
Parameters: predicate (callable) – A callable that will be passed a path and should return true if that path should be considered a root.
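A minimal sketch of the idea, assuming the root is found by walking up through parent directories. The class name PathRootSketch and its call interface are hypothetical and are not taken from reproman.

```python
import os


class PathRootSketch:
    """Illustrative stand-in; not the library's implementation."""

    def __init__(self, predicate):
        self._predicate = predicate
        self._cache = {}  # path -> root (or None), kept across calls

    def __call__(self, path):
        visited = []
        current = path
        root = None
        while True:
            if current in self._cache:
                # A previous call already resolved this path.
                root = self._cache[current]
                break
            visited.append(current)
            if self._predicate(current):
                root = current
                break
            parent = os.path.dirname(current)
            if parent == current:
                # Reached the top without finding a root.
                break
            current = parent
        # Remember the answer for every path inspected on the way up.
        for seen in visited:
            self._cache[seen] = root
        return root
```

Caching every intermediate path means that later lookups for siblings or parents of an already resolved path do not call the predicate again.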
-
class
reproman.utils.
SemanticVersion
(major, minor, patch, tag)¶ Bases:
tuple
-
major
¶ Alias for field number 0
-
minor
¶ Alias for field number 1
-
patch
¶ Alias for field number 2
-
tag
¶ Alias for field number 3
-
-
reproman.utils.
any_re_search
(regexes, value)[source]¶ Return whether any of regexes (list or str) searches successfully for value
-
reproman.utils.
assure_bytes
(s, encoding='utf-8')[source]¶ Convert/encode unicode to bytes if of ‘str’
Parameters: encoding (str, optional) – Encoding to use. “utf-8” is the default
-
reproman.utils.
assure_dict_from_str
(s, **kwargs)[source]¶ Given a multiline string with key=value items convert it to a dictionary
Parameters:
-
reproman.utils.
assure_dir
(*args)[source]¶ Make sure directory exists.
Joins the list of arguments into an OS-specific path to the desired directory and creates it if it does not exist yet.
-
reproman.utils.
assure_list
(s)[source]¶ Given something other than a list, place it into a list. If None, an empty list is returned
Parameters: s (list or anything) –
-
reproman.utils.
assure_list_from_str
(s, sep='\n')[source]¶ Given a multiline string, convert it to a list, or return None if empty
Parameters: s (str or list) –
-
reproman.utils.
assure_tuple_or_list
(obj)[source]¶ Given an object, wrap into a tuple if not list or tuple
-
reproman.utils.
assure_unicode
(s, encoding=None, confidence=None)[source]¶ Convert/decode to str if of ‘bytes’
Parameters:
-
reproman.utils.
attrib
(*args, **kwargs)[source]¶ Extend the attr.ib to include our metadata elements.
ATM we support additional keyword args, which are then stored within metadata: - doc for documentation to describe the attribute (e.g. in --help)
Also, when the default argument of attr.ib is unspecified, set it to None.
-
reproman.utils.
auto_repr
(cls)[source]¶ Decorator for a class to assign it an automagic quick and dirty __repr__
It uses public class attributes to prepare repr of a class
Original idea: http://stackoverflow.com/a/27799004/1265472
-
reproman.utils.
cached_property
(prop)[source]¶ Cache a property’s return value.
This avoids using lru_cache, which is more complicated than needed for simple properties and isn’t available in Python 2’s stdlib.
Use this only if the property’s return value is constant over the life of the object. This isn’t appropriate for a property with a setter or a property whose getter value may change based on some outside state.
This should be positioned below the @property declaration.
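The placement rule above can be illustrated with a small sketch. The decorator cached_property_sketch and the Report class are hypothetical; storing the value in a private instance attribute is an assumption about one simple way to cache, not a description of reproman's code.

```python
from functools import wraps


def cached_property_sketch(prop):
    """Cache the getter's result on the instance under a private name."""
    cache_name = "_cached_" + prop.__name__

    @wraps(prop)
    def wrapper(self):
        if not hasattr(self, cache_name):
            setattr(self, cache_name, prop(self))
        return getattr(self, cache_name)

    return wrapper


class Report:
    def __init__(self):
        self.computed = 0  # counts how often the getter body runs

    @property
    @cached_property_sketch  # positioned below @property
    def total(self):
        self.computed += 1
        return sum(range(1000))
```

Repeated access to Report().total runs the getter body only once per instance.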
-
class
reproman.utils.
chpwd
(path, mkdir=False, logsuffix='')[source]¶ Bases:
object
Wrapper around os.chdir which also adjusts environ[‘PWD’]
The reason is that otherwise PWD is simply inherited from the shell and we have no ability to assess directory path without dereferencing symlinks.
If used as a context manager, it temporarily changes the working directory to the given path
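A rough sketch of the context-manager behaviour, assuming the previous directory and PWD value are restored on exit. chpwd_sketch is a hypothetical name, and the mkdir and logsuffix options are left out.

```python
import os
from contextlib import contextmanager


@contextmanager
def chpwd_sketch(path):
    old_cwd = os.getcwd()
    old_pwd = os.environ.get("PWD")
    os.chdir(path)
    # Record the path as given, so symlinks are not dereferenced in PWD.
    os.environ["PWD"] = path
    try:
        yield
    finally:
        os.chdir(old_cwd)
        if old_pwd is None:
            os.environ.pop("PWD", None)
        else:
            os.environ["PWD"] = old_pwd
```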
-
reproman.utils.
cmd_err_filter
(err_string)[source]¶ Creates a filter for CommandErrors that match a specific error string
Parameters: err_string (basestring) – The error string we want to match
Returns:
Return type: func object -> boolean
-
reproman.utils.
command_as_string
(command)[source]¶ Convert command to the string representation.
Parameters: command (list or str) – If it is a list, convert it to a string, quoting each element as needed. If it is a string, it is returned as is.
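A sketch of the described conversion, assuming POSIX shell quoting via the standard library's shlex.quote. Which quoting routine reproman actually uses is not stated here, and command_as_string_sketch is a hypothetical name.

```python
import shlex


def command_as_string_sketch(command):
    if isinstance(command, str):
        # Strings are returned as is.
        return command
    # Quote each element only where the shell would need it.
    return " ".join(shlex.quote(str(part)) for part in command)
```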
-
reproman.utils.
escape_filename
(filename)[source]¶ Surround filename in “” and escape ” in the filename
-
reproman.utils.
execute_command_batch
(session, command, args, exception_filter=None)[source]¶ Generator that executes session.execute_command, with batches of args
We want to call commands like “apt-cache policy” on a large number of packages, but risk creating command-lines that are too long. This function is a generator that will call execute_command but with batches of arguments (to stay within the command-line length limit) and yield the results.
Parameters: - session – Session object that implements the execute_command() member
- command (sequence) – The command that we wish to execute
- args (sequence) – The long list of additional arguments we wish to pass to the command
- exception_filter (func x -> bool) – A filter of exception types that the calling code will gracefully handle
Returns: stdout of the command, stderr of the command, and an exception that is in the list of expected exceptions
Return type: (out, err, exception)
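The batching idea can be sketched as follows. This is a simplification under stated assumptions: the batch size is passed in directly, exception filtering is omitted, and EchoSession with its execute_command(cmd) returning (stdout, stderr) is a hypothetical stand-in for a real session.

```python
def batched_calls_sketch(session, command, args, batch_len):
    """Yield one result per batch of at most batch_len arguments."""
    args = list(args)
    for start in range(0, len(args), batch_len):
        batch = args[start:start + batch_len]
        yield session.execute_command(list(command) + batch)


class EchoSession:
    """Hypothetical session that only reports what it was asked to run."""

    def execute_command(self, cmd):
        return " ".join(cmd), ""  # (stdout, stderr)
```

Because this is a generator, no command runs until the caller starts iterating over the results.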
-
reproman.utils.
expandpath
(path, force_absolute=True)[source]¶ Expand all variables and user handles in a path.
By default return an absolute path
-
reproman.utils.
file_basename
(name, return_ext=False)[source]¶ Strips up to 2 extensions, each up to 4 characters long and starting with a letter rather than a digit, so that suffixes such as .tar.gz are removed
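One way to express that rule as a regular expression is sketched below. The pattern, the use of os.path.basename, and the name file_basename_sketch are assumptions made for illustration; the return_ext option is left out.

```python
import os.path
import re

# Up to two trailing extensions: a dot, a letter, then up to three more
# letters or digits (at most 4 characters after the dot).
_EXT_RE = re.compile(r"(\.[A-Za-z][A-Za-z0-9]{0,3}){0,2}$")


def file_basename_sketch(name):
    base = os.path.basename(name)
    # Only the leftmost match matters; it covers the trailing extensions.
    return _EXT_RE.sub("", base, count=1)
```

Requiring a leading letter keeps version components such as ".3" in "data-1.2.3.tar.gz" from being treated as extensions.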
-
reproman.utils.
find_files
(regex, topdir='.', exclude=None, exclude_vcs=True, exclude_reproman=False, dirs=False)[source]¶ Generator to find files matching regex
Parameters: - regex (basestring) –
- exclude (basestring, optional) – Matches to exclude
- exclude_vcs – If True, excludes commonly known VCS subdirectories. If string, used as regex to exclude those files (regex: ‘/.(?:git|gitattributes|svn|bzr|hg)(?:/|$)’)
- exclude_reproman – If True, excludes files known to be reproman meta-data files (e.g. under .reproman/ subdirectory) (regex: ‘/.(?:reproman)(?:/|$)’)
- topdir (basestring, optional) – Directory where to search
- dirs (bool, optional) – Whether to match directories as well as files
-
reproman.utils.
generate_unique_name
(pattern, nameset)[source]¶ Create a unique numbered name from a pattern and a set
Parameters: - pattern (basestring) – The pattern for the name (to be used with %) that includes one %d location
- nameset (collection) – Collection (set or list) of existing names. If the generated name is used, then add the name to the nameset.
Returns: The generated unique name
Return type:
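A minimal sketch of numbering against an existing set of names. Starting the counter at 0 is an assumption, and generate_unique_name_sketch is a hypothetical name; as the parameter description notes, adding the result to nameset is left to the caller.

```python
def generate_unique_name_sketch(pattern, nameset):
    n = 0  # assumed starting number
    while pattern % n in nameset:
        n += 1
    return pattern % n
```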
-
reproman.utils.
get_cmd_batch_len
(arg_list, cmd_len)[source]¶ Estimate the maximum batch length for a given argument list
To make sure we don’t call shell commands with too many arguments this function looks at an argument list and the command length without any arguments, and estimates the number of arguments we want to batch together at one time.
Parameters: - arg_list (list) – The list to process in the command
- cmd_len (number) – The length of the command without arguments
Returns: The maximum number in a single batch
Return type: number
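One plausible way to make such an estimate is sketched below. The limit max_cmd_len and the worst-case reasoning are assumptions chosen for illustration, not values taken from reproman, and batch_len_sketch is a hypothetical name.

```python
def batch_len_sketch(arg_list, cmd_len, max_cmd_len=2048):
    if not arg_list:
        return 0
    # Worst case: every argument is as long as the longest one,
    # plus one separating space.
    longest = max(len(str(arg)) for arg in arg_list) + 1
    # Always allow at least one argument per call.
    return max(1, (max_cmd_len - cmd_len) // longest)
```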
-
reproman.utils.
get_func_kwargs_doc
(func)[source]¶ Provides args for a function
Parameters: func (str) – Name of the function from which args are being requested
Returns: The args that the function takes in
Return type: list
-
reproman.utils.
get_tempfile_kwargs
(tkwargs={}, prefix='', wrapped=None)[source]¶ Updates kwargs to be passed to tempfile calls, depending on env vars
-
reproman.utils.
getpwd
()[source]¶ Try to return a CWD without dereferencing possible symlinks
If no PWD found in the env, output of getcwd() is returned
-
reproman.utils.
instantiate_attr_object
(item_type, items)[source]¶ Instantiate item_type given items (for a list or dict)
Provides a more informative exception message in case some arguments are incorrect
-
reproman.utils.
is_binarystring
(s)[source]¶ Return true if an object is a binary string (not unicode)
-
reproman.utils.
is_explicit_path
(path)[source]¶ Return whether a path explicitly points to a location
Any absolute path, or relative path starting with either ‘../’ or ‘./’ is assumed to indicate a location on the filesystem. Any other path format is not considered explicit.
-
reproman.utils.
is_subpath
(path, directory)[source]¶ Test whether path is below (or is itself) directory.
Symbolic links are not resolved before the check.
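A purely lexical check along these lines might look as follows. Normalizing both paths with os.path.normpath is an assumption, and is_subpath_sketch is a hypothetical name; no filesystem access happens, so symbolic links stay unresolved.

```python
import os
import os.path


def is_subpath_sketch(path, directory):
    path = os.path.normpath(path)
    directory = os.path.normpath(directory)
    # Compare with a trailing separator so "/a/bc" is not below "/a/b".
    prefix = directory.rstrip(os.sep) + os.sep
    return path == directory or path.startswith(prefix)
```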
-
reproman.utils.
items_to_dict
(l, attrs='name', ordered=False)[source]¶ Given a list of attr instances, return a dict using specified attrs as keys
Parameters:
Raises: ValueError – If there is a conflict, i.e. multiple items with the same attrs used for key
Returns:
Return type:
-
reproman.utils.
join_sequence_of_dicts
(seq)[source]¶ Joins a sequence of dicts into a single dict
Parameters: seq (sequence) – Sequence of dicts to join
Returns:
Return type: dict
Raises: RuntimeError if a duplicate key is encountered.
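The duplicate-key behaviour can be sketched like this. join_sequence_of_dicts_sketch is a hypothetical name and the error message is invented for the example.

```python
def join_sequence_of_dicts_sketch(seq):
    joined = {}
    for d in seq:
        for key, value in d.items():
            if key in joined:
                # Refuse to silently overwrite an earlier value.
                raise RuntimeError("duplicate key: %r" % (key,))
            joined[key] = value
    return joined
```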
-
reproman.utils.
knows_annex
(path)[source]¶ Returns whether at a given path there is information about an annex
It is just a thin wrapper around GitRepo.is_with_annex() classmethod which also checks for path to exist first.
This includes actually present annexes, but also uninitialized ones, or even the presence of a remote annex branch.
-
reproman.utils.
line_profile
(func)[source]¶ Quick-and-dirty helper to line profile the function and print out stats
-
reproman.utils.
lmtime
(filepath, mtime)[source]¶ Set mtime for files, while not de-referencing symlinks.
To overcome absence of os.lutime
Works only on linux and OSX ATM
-
reproman.utils.
make_tempfile
(content=None, wrapped=None, **tkwargs)[source]¶ Helper class to provide a temporary file name and remove it at the end (context manager)
Parameters: - mkdir (bool, optional (default: False)) – If True, temporary directory created using tempfile.mkdtemp()
- content (str or bytes, optional) – Content to be stored in the file created
- wrapped (function, optional) – If set, function name used to prefix temporary file name
- **tkwargs – All other arguments are passed into the call to tempfile.mk{,d}temp(), and resultant temporary filename is passed as the first argument into the function t. If no ‘prefix’ argument is provided, it will be constructed using module and function names (‘.’ replaced with ‘_’).
To change the used directory without providing keyword argument 'dir', set REPROMAN_TESTS_TEMPDIR.
Examples
>>> from os.path import exists
>>> from reproman.utils import make_tempfile
>>> with make_tempfile() as fname:
...     k = open(fname, 'w').write('silly test')
>>> assert not exists(fname)  # was removed
>>> with make_tempfile(content="blah") as fname:
...     assert open(fname).read() == "blah"
-
reproman.utils.
merge_dicts
(ds)[source]¶ Merge an iterable of dictionaries into a single dictionary.
In the case of key collisions, the last value wins.
Parameters: ds (iterable of dicts) –
Returns:
Return type: dict
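A short sketch of the "last value wins" rule; merge_dicts_sketch is a hypothetical name.

```python
def merge_dicts_sketch(ds):
    merged = {}
    for d in ds:
        # Later dictionaries overwrite keys set by earlier ones.
        merged.update(d)
    return merged
```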
-
reproman.utils.
not_supported_on_windows
(msg=None)[source]¶ A little helper to be invoked to consistently fail whenever functionality is not supported (yet) on Windows
-
reproman.utils.
only_with_values
(d)[source]¶ Given a dictionary, return a new one containing only the entries that have non-null values
-
reproman.utils.
optional_args
(decorator)[source]¶ Allows a decorator to take optional positional and keyword arguments. Assumes that taking a single, callable, positional argument means that it is decorating a function, i.e. something like this:
@my_decorator
def function(): pass
Calls decorator with decorator(f, *args, **kwargs)
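The convention described above can be sketched as follows. optional_args_sketch and the example decorator tagged are hypothetical names; the sketch follows the stated rule that a single callable positional argument means the decorator is being applied directly.

```python
from functools import wraps


def optional_args_sketch(decorator):
    @wraps(decorator)
    def wrapper(*args, **kwargs):
        def dec(f):
            return decorator(f, *args, **kwargs)

        # Bare use, e.g. @tagged: the only argument is the function itself.
        if not kwargs and len(args) == 1 and callable(args[0]):
            return decorator(args[0])
        # Parametrized use, e.g. @tagged(tag="x"): return the real decorator.
        return dec

    return wrapper


@optional_args_sketch
def tagged(func, tag="default"):
    @wraps(func)
    def inner(*a, **kw):
        return (tag, func(*a, **kw))
    return inner


@tagged
def one():
    return 1


@tagged(tag="custom")
def two():
    return 2
```

A known limitation of this convention is that a decorator cannot take a single callable as its only positional option, since that call is indistinguishable from bare use.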
-
reproman.utils.
parse_kv_list
(params)[source]¶ Create a dict from a “key=value” list.
Parameters: params (sequence of str or mapping) – For a sequence, each item should have the form “<key>=<value>”. If params is a mapping, it will be returned as is.
Returns: A mapping from backend key to value.
Raises: ValueError if item in params does not match expected “key=value” format.
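A sketch of the parsing rule, assuming each item is split at the first "=" so that values may themselves contain "=". parse_kv_list_sketch is a hypothetical name and the error message is invented for the example.

```python
from collections.abc import Mapping


def parse_kv_list_sketch(params):
    if isinstance(params, Mapping):
        # Mappings are returned as is.
        return params
    result = {}
    for item in params:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected 'key=value', got %r" % (item,))
        result[key] = value
    return result
```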
-
reproman.utils.
parse_semantic_version
(version)[source]¶ Split version into major, minor, patch, and tag components.
Parameters: version (str) – A version string X.Y.Z. X, Y, and Z must be digits. Any remaining text is treated as a tag (e.g., “-rc1”).
Returns: A namedtuple with the form (major, minor, patch, tag)
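The split can be sketched with a regular expression and a namedtuple. Keeping the components as strings and raising ValueError on a mismatch are assumptions; SemanticVersionSketch and parse_semantic_version_sketch are hypothetical names.

```python
import re
from collections import namedtuple

SemanticVersionSketch = namedtuple(
    "SemanticVersionSketch", ["major", "minor", "patch", "tag"]
)


def parse_semantic_version_sketch(version):
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(.*)", version)
    if not match:
        raise ValueError("not of the form X.Y.Z: %r" % (version,))
    # Components are kept as strings; the remainder becomes the tag.
    return SemanticVersionSketch(*match.groups())
```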
-
reproman.utils.
partition
(items, predicate=<class 'bool'>)[source]¶ Partition items by predicate.
Parameters: - items (iterable) –
- predicate (callable) – A function that will be mapped over each element in items. The elements will be partitioned based on whether the return value is false or true.
Returns: A tuple with two generators, the first for ‘false’ items and the second for ‘true’ ones.
Notes
Taken from Peter Otten’s snippet posted at https://nedbatchelder.com/blog/201306/filter_a_list_into_two_parts.html
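A sketch of this kind of two-way split using itertools.tee, so that the predicate is evaluated once per element. partition_sketch is a hypothetical name, and the code is written for illustration rather than copied from reproman.

```python
from itertools import tee


def partition_sketch(items, predicate=bool):
    # Pair each item with its predicate result, then read the pairs twice.
    a, b = tee((predicate(item), item) for item in items)
    return (
        (item for ok, item in a if not ok),
        (item for ok, item in b if ok),
    )
```

Both results are lazy generators; consuming one fully before the other makes tee buffer the pairs in between.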
-
reproman.utils.
pycache_source
(path)[source]¶ Map a pycache path to the original path.
Parameters: path (str) – A Python cache file.
Returns: Path of cached Python file (str) or None if path doesn’t look like a cache file.
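A comparable mapping can be sketched with the standard library's importlib.util.source_from_cache, which raises ValueError for paths that do not follow the __pycache__ naming scheme. Using that function here is a substitution for illustration, not a statement about how reproman implements it, and pycache_source_sketch is a hypothetical name.

```python
import importlib.util


def pycache_source_sketch(path):
    try:
        return importlib.util.source_from_cache(path)
    except ValueError:
        # Not a recognizable cache file path.
        return None
```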
-
reproman.utils.
rmtemp
(f, *args, **kwargs)[source]¶ Wrapper to centralize removing of temp files so we could keep them around
It will not remove the temporary file/directory if REPROMAN_TESTS_KEEPTEMP environment variable is defined
-
reproman.utils.
rmtree
(path, chmod_files='auto', *args, **kwargs)[source]¶ Remove a tree, first making all files and directories writable again, as is needed to remove a git-annex .git directory
Parameters: - chmod_files (string or bool, optional) – Whether to also make files writable before removal. Usually only directories need write permissions. If ‘auto’, files are chmod-ed on Windows by default
- *args –
- **kwargs – Passed into shutil.rmtree call
-
reproman.utils.
rotree
(path, ro=True, chmod_files=True)[source]¶ To make tree read-only or writable
Parameters:
-
reproman.utils.
safe_write
(ostream, s, encoding='utf-8')[source]¶ Safely write different string types to an output stream
-
reproman.utils.
setup_exceptionhook
(ipython=False)[source]¶ Overloads default sys.excepthook with our exceptionhook handler.
If interactive, our exceptionhook handler will invoke pdb.post_mortem; if not interactive, then invokes default handler.
-
reproman.utils.
swallow_outputs
()[source]¶ Context manager to help consuming both stdout and stderr, and print()
stdout is available as cm.out and stderr as cm.err, where cm is the yielded context manager. Internally uses temporary files rather than StringIO, which lacks .fileno, to avoid side effects of swallowing.
Mocking print is necessary for some uses where a reference to the original sys.stdout was already bound, so mocking sys.stdout later had no effect. Overriding the print function had the desired effect
-
reproman.utils.
to_binarystring
(s, encoding='utf-8')[source]¶ Converts any type string to binarystring
-
reproman.utils.
unique
(seq, key=None)[source]¶ Given a sequence return a list only with unique elements while maintaining order
This is the fastest solution. See https://www.peterbe.com/plog/uniqifiers-benchmark and http://stackoverflow.com/a/480227/1265472 for more information. Enhancement – added ability to compare for uniqueness using a key function
Parameters: - seq – Sequence to analyze
- key (callable, optional) – Function to call on each element so that uniqueness is decided not on the full element but on, e.g., one of its members
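An order-preserving de-duplication with an optional key can be sketched as below. unique_sketch is a hypothetical name, and the sketch assumes the elements (or their keys) are hashable.

```python
def unique_sketch(seq, key=None):
    seen = set()
    result = []
    for item in seq:
        marker = item if key is None else key(item)
        if marker not in seen:
            # First occurrence wins and keeps its original position.
            seen.add(marker)
            result.append(item)
    return result
```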
-
reproman.utils.
updated
(d, update)[source]¶ Return a copy of the input with the ‘update’ applied
Primarily for updating dictionaries
-
reproman.utils.
write_update
(fname, content, encoding=None)[source]¶ Write content to fname unless it already has matching content.
This is the same as simply writing the content, except no writing occurs if the content of the existing file matches, the write or update is logged, and the leading directories of fname are created if needed.
Parameters: