How can I parse a YAML file in Python?
The easiest and purest method without relying on C headers is PyYAML (documentation), which can be installed via pip install pyyaml:
#!/usr/bin/env python
import yaml

with open("example.yaml", "r") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)
And that's it. A plain yaml.load() function also exists, but yaml.safe_load() should always be preferred to avoid introducing the possibility of arbitrary code execution. So unless you explicitly need arbitrary object serialization/deserialization, use safe_load.
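To illustrate why safe_load is preferred, here is a minimal sketch. The input string is made up for illustration: it uses a PyYAML-specific Python tag that asks the loader to call a function while parsing. safe_load has no constructor for such tags and raises an error, whereas yaml.unsafe_load would execute the call.

```python
import yaml

# Hypothetical untrusted input: a PyYAML-specific tag that asks the loader
# to call os.getcwd() during parsing.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError:
    # safe_load does not know how to construct python/* tags, so it raises.
    rejected = True

print(rejected)
```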
Note that the PyYAML project supports versions up through the YAML 1.1 specification. If YAML 1.2 specification support is needed, see ruamel.yaml as noted in this answer.
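The difference between the specification versions shows up in how plain scalars are resolved. The sketch below (the keys and values are invented for illustration) shows PyYAML's YAML 1.1 behaviour: unquoted yes and no become booleans and a leading zero means octal. A YAML 1.2 loader would typically read yes and no as plain strings, so quoting is the portable way to keep them as text.

```python
import yaml

doc = """
enabled: yes
country: no
mode: 010
quoted: "no"
"""

data = yaml.safe_load(doc)
# Under YAML 1.1 rules: yes -> True, no -> False, 010 -> 8 (octal).
# The quoted value stays a string.
print(data)
```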
You could also use oyaml, a drop-in replacement for PyYAML that keeps your YAML file ordered the same way you had it. View Snyk of oyaml here.
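If key order on output is the only concern, plain PyYAML may be enough. This is a sketch assuming PyYAML 5.1 or later, where yaml.dump accepts a sort_keys argument; by default keys are sorted alphabetically, and sort_keys=False keeps the dictionary's insertion order. The sample data is made up.

```python
import yaml

data = {"zebra": 1, "apple": 2}

# Default behaviour: keys are written in sorted order.
sorted_text = yaml.dump(data, default_flow_style=False)

# sort_keys=False keeps the insertion order of the dict.
ordered_text = yaml.dump(data, default_flow_style=False, sort_keys=False)

print(sorted_text)
print(ordered_text)
```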
# -*- coding: utf-8 -*-
import yaml
import io

# Define data
data = {
    'a list': [
        1,
        42,
        3.141,
        1337,
        'help',
        u'€'
    ],
    'a string': 'bla',
    'another dict': {
        'foo': 'bar',
        'key': 'value',
        'the answer': 42
    }
}

# Write YAML file
with io.open('data.yaml', 'w', encoding='utf8') as outfile:
    yaml.dump(data, outfile, default_flow_style=False, allow_unicode=True)

# Read YAML file
with open("data.yaml", 'r') as stream:
    data_loaded = yaml.safe_load(stream)

print(data == data_loaded)
a list:
- 1
- 42
- 3.141
- 1337
- help
- €
a string: bla
another dict:
  foo: bar
  key: value
  the answer: 42
Common file extensions are .yml and .yaml.
For your application, the following might be important:
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python
If you have YAML that conforms to the YAML 1.2 specification (released 2009) then you should use ruamel.yaml (disclaimer: I am the author of that package). It is essentially a superset of PyYAML, which supports most of YAML 1.1 (from 2005).
If you want to be able to preserve your comments when round-tripping, you certainly should use ruamel.yaml.
Upgrading @Jon's example is easy:
import ruamel.yaml as yaml

with open("example.yaml") as stream:
    try:
        print(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        print(exc)
Use safe_load() unless you really have full control over the input, need the full load() (seldom the case), and know what you are doing.
If you are using pathlib Path for manipulating files, you are better off using the new API ruamel.yaml provides:
from ruamel.yaml import YAML
from pathlib import Path

path = Path('example.yaml')
yaml = YAML(typ='safe')
data = yaml.load(path)
First install PyYAML using pip3. Then import the yaml module and load the file into a dictionary called my_dict:
import yaml

with open('filename.yaml') as f:
    my_dict = yaml.safe_load(f)
That's all you need. Now the entire YAML file is in the my_dict dictionary.
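One caveat worth sketching: the result is only a dictionary when the top level of the YAML document is a mapping. The inline strings below are made-up stand-ins for file contents and show the other cases.

```python
import yaml

mapping_result = yaml.safe_load("a: 1")       # top-level mapping
sequence_result = yaml.safe_load("- a\n- b")  # top-level sequence
empty_result = yaml.safe_load("")             # empty document

print(type(mapping_result).__name__)
print(type(sequence_result).__name__)
print(empty_result)
```

Checking isinstance(my_dict, dict) before indexing by key is a cheap way to guard against an empty or list-shaped file.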
To access any element of a YAML file like this:
global:
  registry:
    url: dtr-:5000/
    repoPath:
  dbConnectionString: jdbc:oracle:thin:@x.x.x.x:1521:abcd
You can use the following Python script:
import yaml

with open("/some/path/to/yaml.file", 'r') as f:
    valuesYaml = yaml.load(f, Loader=yaml.FullLoader)

print(valuesYaml['global']['dbConnectionString'])
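Chained indexing like this raises KeyError as soon as one level is missing. As a rough sketch, a small helper can walk the nested dictionaries and fall back to a default instead. The name get_nested is hypothetical (it is not part of PyYAML), and the values dictionary is a hand-written stand-in for what the loader would return for the file above.

```python
def get_nested(mapping, *keys, default=None):
    """Walk nested dicts key by key; return default if any key is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Stand-in for the parsed YAML file (an empty YAML value loads as None).
values = {
    "global": {
        "registry": {"url": "dtr-:5000/", "repoPath": None},
        "dbConnectionString": "jdbc:oracle:thin:@x.x.x.x:1521:abcd",
    }
}

print(get_nested(values, "global", "registry", "url"))
print(get_nested(values, "global", "missing", default="n/a"))
```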
Example:
defaults.yaml
url: https://www.google.com
environment.py
from ruamel import yaml

data = yaml.safe_load(open('defaults.yaml'))
data['url']
I use ruamel.yaml. Details & debate here.
from ruamel import yaml

with open(filename, 'r') as fp:
    read_data = yaml.load(fp)
ruamel.yaml is compatible with old usages of PyYAML (apart from some easily solvable problems) and, as stated in the link I provided, you can use
from ruamel import yaml
instead of
import yaml
and it will fix most of your problems.
EDIT: PyYAML is not dead as it turns out; it's just maintained in a different place.
I made my own script for this. Feel free to use it, as long as you keep the attribution. The script can parse YAML from a file (function load), parse YAML from a string (function loads), and convert a dictionary into YAML (function dumps). It respects all variable types.
# © didlly AGPL-3.0 License - github.com/didlly

def is_float(string: str) -> bool:
    try:
        float(string)
        return True
    except ValueError:
        return False


def is_integer(string: str) -> bool:
    try:
        int(string)
        return True
    except ValueError:
        return False


def load(path: str) -> dict:
    # Caution: this parser builds Python statements from the file's contents
    # and runs them with exec(), so only use it on trusted input.
    with open(path, "r") as yaml:
        levels = []
        data = {}
        indentation_str = ""

        for line in yaml.readlines():
            if line.replace(line.lstrip(), "") != "" and indentation_str == "":
                indentation_str = line.replace(line.lstrip(), "").rstrip("\n")
            if line.strip() == "":
                continue
            elif line.rstrip()[-1] == ":":
                key = line.strip()[:-1]
                quoteless = (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                )

                if len(line.replace(line.strip(), "")) // 2 < len(levels):
                    if quoteless:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                    else:
                        levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
                else:
                    if quoteless:
                        levels.append(f"[{line.strip()[:-1]}]")
                    else:
                        levels.append(f"['{line.strip()[:-1]}']")
                if quoteless:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                        + " = {}"
                    )
                else:
                    exec(
                        f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                        + " = {}"
                    )

                continue

            key = line.split(":")[0].strip()
            value = ":".join(line.split(":")[1:]).strip()

            if (
                is_float(value)
                or is_integer(value)
                or value == "True"
                or value == "False"
                or ("[" in value and "]" in value)
            ):
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                    )
            else:
                if (
                    is_float(key)
                    or is_integer(key)
                    or key == "True"
                    or key == "False"
                    or ("[" in key and "]" in key)
                ):
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                    )
                else:
                    exec(
                        f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                    )
    return data


def loads(yaml: str) -> dict:
    # Caution: like load(), this runs statements built from the input with
    # exec(), so only use it on trusted input.
    levels = []
    data = {}
    indentation_str = ""

    for line in yaml.split("\n"):
        if line.replace(line.lstrip(), "") != "" and indentation_str == "":
            indentation_str = line.replace(line.lstrip(), "")
        if line.strip() == "":
            continue
        elif line.rstrip()[-1] == ":":
            key = line.strip()[:-1]
            quoteless = (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            )

            if len(line.replace(line.strip(), "")) // 2 < len(levels):
                if quoteless:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"[{key}]"
                else:
                    levels[len(line.replace(line.strip(), "")) // 2] = f"['{key}']"
            else:
                if quoteless:
                    levels.append(f"[{line.strip()[:-1]}]")
                else:
                    levels.append(f"['{line.strip()[:-1]}']")
            if quoteless:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}]"
                    + " = {}"
                )
            else:
                exec(
                    f"data{''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}']"
                    + " = {}"
                )

            continue

        key = line.split(":")[0].strip()
        value = ":".join(line.split(":")[1:]).strip()

        if (
            is_float(value)
            or is_integer(value)
            or value == "True"
            or value == "False"
            or ("[" in value and "]" in value)
        ):
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = {value}"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = {value}"
                )
        else:
            if (
                is_float(key)
                or is_integer(key)
                or key == "True"
                or key == "False"
                or ("[" in key and "]" in key)
            ):
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}[{key}] = '{value}'"
                )
            else:
                exec(
                    f"data{'' if line == line.strip() else ''.join(str(i) for i in levels[:line.replace(line.lstrip(), '').count(indentation_str) if indentation_str != '' else 0])}['{key}'] = '{value}'"
                )

    return data


def dumps(yaml: dict, indent="") -> str:
    """A procedure which converts the dictionary passed to the procedure into its yaml equivalent.

    Args:
        yaml (dict): The dictionary to be converted.

    Returns:
        data (str): The dictionary in yaml form.
    """

    data = ""

    for key in yaml.keys():
        if type(yaml[key]) == dict:
            data += f"\n{indent}{key}:\n"
            data += dumps(yaml[key], f"{indent}  ")
        else:
            data += f"{indent}{key}: {yaml[key]}\n"

    return data


print(load("config.yml"))
config.yml
level 0 value: 0
level 1:
  level 1 value: 1
  level 2:
    level 2 value: 2
level 1 2:
  level 1 2 value: 1 2
  level 2 2:
    level 2 2 value: 2 2
{'level 0 value': 0, 'level 1': {'level 1 value': 1, 'level 2': {'level 2 value': 2}}, 'level 1 2': {'level 1 2 value': '1 2', 'level 2 2': {'level 2 2 value': '2 2'}}}
#!/usr/bin/env python
import sys
import yaml


def main(argv):
    # Parse the file named on the command line; exit 0 on success, 1 on a YAML error.
    with open(argv[0]) as stream:
        try:
            print(yaml.safe_load(stream))
            return 0
        except yaml.YAMLError as exc:
            print(exc)
            return 1


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
A read_yaml_file function that returns all the data as a dictionary:
import yaml


def read_yaml_file(full_path=None, relative_path=None):
    # ProjectPaths is a helper from the author's own project (not shown here).
    if relative_path is not None:
        resource_file_location_local = ProjectPaths.get_project_root_path() + relative_path
    else:
        resource_file_location_local = full_path

    with open(resource_file_location_local, 'r') as stream:
        try:
            file_artifacts = yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            print(exc)
            return {}  # parsing failed, so there is nothing to return
    return dict(file_artifacts.items())