pydicom.dataset.Dataset
-
class pydicom.dataset.Dataset(*args, **kwargs)
Contains a collection (dictionary) of DICOM Data Elements. Behaves like a dict.
Note
Dataset is only derived from dict to make it work in a NumPy ndarray. The parent dict class is never called, as all dict methods are overridden.
Examples
Add an element to the Dataset (for elements in the DICOM dictionary):
>>> from pydicom.dataset import Dataset
>>> from pydicom.dataelem import DataElement
>>> ds = Dataset()
>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.add_new(0x00100020, 'LO', '12345')
>>> ds[0x0010, 0x0030] = DataElement(0x00100030, 'DA', '20010101')
Add a sequence element to the Dataset:
>>> ds.BeamSequence = [Dataset(), Dataset(), Dataset()]
>>> ds.BeamSequence[0].Manufacturer = "Linac, co."
>>> ds.BeamSequence[1].Manufacturer = "Linac and Sons, co."
>>> ds.BeamSequence[2].Manufacturer = "Linac and Daughters, co."
Add private elements to the Dataset:
>>> block = ds.private_block(0x0041, 'My Creator', create=True)
>>> block.add_new(0x01, 'LO', '12345')
Updating and retrieving element values:
>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.PatientName
'CITIZEN^Joan'
>>> ds.PatientName = "CITIZEN^John"
>>> ds.PatientName
'CITIZEN^John'
Retrieving an element’s value from a Sequence:
>>> ds.BeamSequence[0].Manufacturer
'Linac, co.'
>>> ds.BeamSequence[1].Manufacturer
'Linac and Sons, co.'
Accessing the DataElement items:
>>> elem = ds['PatientName']
>>> elem
(0010, 0010) Patient's Name PN: 'CITIZEN^John'
>>> elem = ds[0x00100010]
>>> elem
(0010, 0010) Patient's Name PN: 'CITIZEN^John'
>>> elem = ds.data_element('PatientName')
>>> elem
(0010, 0010) Patient's Name PN: 'CITIZEN^John'
Accessing a private DataElement item:
>>> block = ds.private_block(0x0041, 'My Creator')
>>> elem = block[0x01]
>>> elem
(0041, 1001) Private tag data LO: '12345'
>>> elem.value
'12345'
Alternatively:
>>> ds.get_private_item(0x0041, 0x01, 'My Creator').value
'12345'
Deleting an element from the Dataset:
>>> del ds.PatientID
>>> del ds.BeamSequence[1].Manufacturer
>>> del ds.BeamSequence[2]
Deleting a private element from the Dataset:
>>> block = ds.private_block(0x0041, 'My Creator')
>>> if 0x01 in block:
...     del block[0x01]
Determining if an element is present in the Dataset:
>>> 'PatientName' in ds
True
>>> 'PatientID' in ds
False
>>> (0x0010, 0x0030) in ds
True
>>> 'Manufacturer' in ds.BeamSequence[0]
True
Iterating through the top level of a Dataset only (excluding Sequences):
>>> for elem in ds:
...     print(elem)
(0010, 0010) Patient's Name PN: 'CITIZEN^John'
Iterating through the entire Dataset (including Sequences):
>>> for elem in ds.iterall():
...     print(elem)
(0010, 0010) Patient's Name PN: 'CITIZEN^John'
Recursively iterate through a Dataset (including Sequences):
>>> def recurse(ds):
...     for elem in ds:
...         if elem.VR == 'SQ':
...             # A sequence value is a list of Dataset items
...             for item in elem.value:
...                 recurse(item)
...         else:
...             # Do something useful with each DataElement
...             pass
Converting the Dataset to and from JSON:
>>> ds = Dataset()
>>> ds.PatientName = "Some^Name"
>>> jsonmodel = ds.to_json()
>>> ds2 = Dataset()
>>> ds2.from_json(jsonmodel)
(0010, 0010) Patient's Name PN: u'Some^Name'
-
default_element_format
The default formatting for string display.
- Type
str
-
default_sequence_element_format
The default formatting for string display of sequences.
- Type
str
-
indent_chars
For string display, the characters used to indent nested Sequences. Default is " ".
- Type
str
-
is_little_endian
Shall be set before writing with write_like_original=False. The Dataset (excluding the pixel data) will be written using the given endianness.
- Type
bool
-
is_implicit_VR
Shall be set before writing with write_like_original=False. The Dataset will be written using the transfer syntax with the given VR handling, e.g. Little Endian Implicit VR if True, and Little Endian Explicit VR or Big Endian Explicit VR (depending on Dataset.is_little_endian) if False.
- Type
bool
Methods
__init__(*args, **kwargs) – Create a new Dataset instance.
add(data_element) – Add an element to the Dataset.
add_new(tag, VR, value) – Create a new element and add it to the Dataset.
clear() – Delete all the elements from the Dataset.
convert_pixel_data([handler_name]) – Convert pixel data to a numpy.ndarray internally.
copy()
data_element(name) – Return the element corresponding to the element keyword name.
decode() – Apply character set decoding to the elements in the Dataset.
decompress([handler_name]) – Decompresses Pixel Data and modifies the Dataset in-place.
dir(*filters) – Return an alphabetical list of element keywords in the Dataset.
elements() – Yield the top-level elements of the Dataset.
ensure_file_meta() – Create an empty Dataset.file_meta if none exists.
fix_meta_info([enforce_standard]) – Ensure the file meta info exists and has the correct values for transfer syntax and media storage UIDs.
formatted_lines([element_format, …]) – Iterate through the Dataset yielding formatted str for each element.
from_json(json_dataset[, bulk_data_uri_handler]) – Add elements to the Dataset from DICOM JSON format.
fromkeys – Create a new dictionary with keys from iterable and values set to value.
get(key[, default]) – Simulate dict.get() to handle element tags and keywords.
get_item(key) – Return the raw data element if possible.
get_private_item(group, element_offset, …) – Return the data element for the given private tag group.
group_dataset(group) – Return a Dataset containing only elements of a certain group.
items() – Return the Dataset items to simulate dict.items().
iterall() – Iterate through the Dataset, yielding all the elements.
keys() – Return the Dataset keys to simulate dict.keys().
overlay_array(group) – Return the Overlay Data in group as a numpy.ndarray.
pop(key, *args) – Emulate dict.pop() with support for tags and keywords.
popitem() – Remove and return a (key, value) pair as a 2-tuple.
private_block(group, private_creator[, create]) – Return the block for the given tag group and private_creator.
private_creators(group) – Return a list of private creator names in the given group.
remove_private_tags() – Remove all private elements from the Dataset.
save_as(filename[, write_like_original]) – Write the Dataset to filename.
set_original_encoding(is_implicit_vr, …) – Set the values for the original transfer syntax and encoding.
setdefault(key[, default]) – Emulate dict.setdefault() with support for tags and keywords.
to_json([bulk_data_threshold, …]) – Return a JSON representation of the Dataset.
to_json_dict([bulk_data_threshold, …]) – Return a dictionary representation of the Dataset conforming to the DICOM JSON Model as described in the DICOM Standard, Part 18, Annex F.
top() – Return a str representation of the top level elements.
trait_names() – Return a list of valid names for auto-completion code.
update(dictionary) – Extend dict.update() to handle DICOM tags and keywords.
values() – Return the Dataset values to simulate dict.values().
walk(callback[, recursive]) – Iterate through the Dataset's elements and run callback on each.
Attributes
is_original_encoding – Return True if the encoding to be used for writing is set and is the same as that used to originally encode the Dataset.
pixel_array – Return the pixel data as a numpy.ndarray.
-
add(data_element)
Add an element to the Dataset.
Equivalent to ds[data_element.tag] = data_element
- Parameters
data_element (dataelem.DataElement) – The DataElement to add.
-
add_new(tag, VR, value)
Create a new element and add it to the Dataset.
- Parameters
tag – The DICOM (group, element) tag in any form accepted by Tag() such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.
VR (str) – The 2 character DICOM value representation (see DICOM Standard, Part 5, Section 6.2).
value – The value of the data element. One of the following:
    a single string or number
    a list or tuple with all strings or all numbers
    a multi-value string with backslash separator
    for a sequence element, an empty list or list of Dataset
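The tag forms accepted above all denote the same 32-bit number, with the group in the upper 16 bits and the element in the lower 16 bits. The following is a minimal sketch of that arithmetic only. The helpers combine_tag and split_tag are hypothetical names used for illustration and are not part of pydicom; Tag() itself accepts more input forms.

```python
def combine_tag(group, element):
    """Pack a 16-bit group and a 16-bit element number into one 32-bit tag."""
    if not (0 <= group <= 0xFFFF and 0 <= element <= 0xFFFF):
        raise ValueError("group and element must each fit in 16 bits")
    return (group << 16) | element


def split_tag(tag):
    """Return the (group, element) pair encoded in a 32-bit tag."""
    return tag >> 16, tag & 0xFFFF


# (0x0010, 0x0010) and 0x00100010 describe the same tag.
patient_name = combine_tag(0x0010, 0x0010)
print(hex(patient_name))      # 0x100010
print(split_tag(0x00100020))  # (16, 32)
```

This is why (0x10, 0x10) and 0x00100010 can be used interchangeably wherever a tag is expected.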
-
convert_pixel_data(handler_name='')
Convert pixel data to a numpy.ndarray internally.
- Parameters
handler_name (str, optional) – The name of the pixel handler that shall be used to decode the data. Supported names are: 'gdcm', 'pillow', 'jpeg_ls', 'rle' and 'numpy'. If not used (the default), a matching handler is used from the handlers configured in pixel_data_handlers.
- Returns
Converted pixel data is stored internally in the dataset.
- Return type
None
- Raises
ValueError – If handler_name is not a valid handler name.
NotImplementedError – If the given handler or any handler, if none given, is unable to decompress pixel data with the current transfer syntax.
RuntimeError – If the given handler, or the handler that has been selected if none given, is not available.
Notes
If the pixel data is in a compressed image format, the data is decompressed and any related data elements are changed accordingly.
-
data_element(name)
Return the element corresponding to the element keyword name.
- Parameters
name (str) – A DICOM element keyword.
- Returns
For the given DICOM element keyword, return the corresponding DataElement if present, None otherwise.
- Return type
dataelem.DataElement or None
-
decode()
Apply character set decoding to the elements in the Dataset.
See DICOM Standard, Part 5, Section 6.1.1.
-
decompress(handler_name='')
Decompresses Pixel Data and modifies the Dataset in-place.
If the transfer syntax is not compressed, the pixel data is converted to a numpy.ndarray internally, but not returned.
If the pixel data is compressed, it is decompressed using an image handler and the internal state is updated accordingly:
    Dataset.file_meta.TransferSyntaxUID is updated to the non-compressed form
    is_undefined_length is False for the (7FE0,0010) Pixel Data element.
Changed in version 1.4: The handler_name keyword argument was added.
- Parameters
handler_name (str, optional) – The name of the pixel handler that shall be used to decode the data. Supported names are: 'gdcm', 'pillow', 'jpeg_ls', 'rle' and 'numpy'. If not used (the default), a matching handler is used from the handlers configured in pixel_data_handlers.
- Return type
None
- Raises
NotImplementedError – If the pixel data was originally compressed but file is not Explicit VR Little Endian as required by the DICOM Standard.
-
dir(*filters)
Return an alphabetical list of element keywords in the Dataset.
Intended mainly for use in interactive Python sessions. Only lists the element keywords in the current level of the Dataset (i.e. the contents of any sequence elements are ignored).
- Parameters
filters (str) – Zero or more string arguments to the function. Used for case-insensitive match to any part of the DICOM keyword.
- Returns
The matching element keywords in the dataset. If no filters are used then all element keywords are returned.
- Return type
list of str
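The filter matching described for dir() can be pictured as a case-insensitive substring test. The sketch below works on a plain list of keywords; filter_keywords is a hypothetical stand-in rather than pydicom code, and it assumes that a keyword is kept when it matches any one of the filters.

```python
def filter_keywords(keywords, *filters):
    """Return keywords alphabetically, optionally keeping only matches.

    Assumption: a keyword is kept if ANY filter occurs somewhere in it,
    compared without regard to case.
    """
    if not filters:
        return sorted(keywords)
    lowered = [f.lower() for f in filters]
    return sorted(
        kw for kw in keywords
        if any(f in kw.lower() for f in lowered)
    )


keywords = ["PatientName", "PatientID", "StudyDate", "Manufacturer"]
print(filter_keywords(keywords, "pat"))           # ['PatientID', 'PatientName']
print(filter_keywords(keywords, "DATE", "manu"))  # ['Manufacturer', 'StudyDate']
```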
-
elements()
Yield the top-level elements of the Dataset.
New in version 1.1.
Examples
>>> ds = Dataset()
>>> for elem in ds.elements():
...     print(elem)
The elements are returned in the same way as in Dataset.__getitem__().
- Yields
dataelem.DataElement or dataelem.RawDataElement – The unconverted elements sorted by increasing tag order.
-
ensure_file_meta()
Create an empty Dataset.file_meta if none exists.
New in version 1.2.
-
fix_meta_info(enforce_standard=True)
Ensure the file meta info exists and has the correct values for transfer syntax and media storage UIDs.
New in version 1.2.
Warning
The transfer syntax for is_implicit_VR = False and is_little_endian = True is ambiguous and will therefore not be set.
- Parameters
enforce_standard (bool, optional) – If True, a check for incorrect and missing elements is performed (see validate_file_meta()).
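The ambiguity mentioned in the warning arises because explicit VR with little endian byte order is shared by Explicit VR Little Endian and by the compressed transfer syntaxes, so the two flags alone cannot identify a single UID. The sketch below illustrates that mapping. The function unambiguous_transfer_syntax is a hypothetical helper, not part of pydicom; the UIDs are the standard DICOM values.

```python
IMPLICIT_VR_LITTLE_ENDIAN = "1.2.840.10008.1.2"
EXPLICIT_VR_BIG_ENDIAN = "1.2.840.10008.1.2.2"


def unambiguous_transfer_syntax(is_implicit_vr, is_little_endian):
    """Return the transfer syntax UID implied by the two flags, or None.

    None means the flags do not determine a single transfer syntax.
    """
    if is_implicit_vr and is_little_endian:
        return IMPLICIT_VR_LITTLE_ENDIAN
    if not is_implicit_vr and not is_little_endian:
        return EXPLICIT_VR_BIG_ENDIAN
    if not is_implicit_vr and is_little_endian:
        # Explicit VR Little Endian or any compressed syntax: ambiguous.
        return None
    # No standard DICOM transfer syntax combines implicit VR with big endian.
    raise ValueError("Implicit VR Big Endian is not a standard transfer syntax")


print(unambiguous_transfer_syntax(True, True))   # 1.2.840.10008.1.2
print(unambiguous_transfer_syntax(False, True))  # None
```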
-
formatted_lines(element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', sequence_element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', indent_format=None)
Iterate through the Dataset yielding formatted str for each element.
- Parameters
element_format (str) – The string format to use for non-sequence elements. Formatting uses the attributes of DataElement. Default is "%(tag)s %(name)-35.35s %(VR)s: %(repval)s".
sequence_element_format (str) – The string format to use for sequence elements. Formatting uses the attributes of DataElement. Default is "%(tag)s %(name)-35.35s %(VR)s: %(repval)s".
indent_format (str or None) – Placeholder for future functionality.
- Yields
str – A string representation of an element.
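The default format strings use Python's printf-style mapping keys. As a rough illustration of how such a string expands, the sketch below applies the default element_format to a plain dict. The dict is a hypothetical stand-in for the tag, name, VR and repval attributes of a DataElement; no pydicom objects are involved.

```python
element_format = "%(tag)s %(name)-35.35s %(VR)s: %(repval)s"

# Hypothetical values standing in for a DataElement's attributes.
fields = {
    "tag": "(0010, 0010)",
    "name": "Patient's Name",
    "VR": "PN",
    "repval": "'CITIZEN^John'",
}

# "-35.35s" left-justifies the name and pads or truncates it to 35 characters.
line = element_format % fields
print(line)

# Names longer than 35 characters are cut to fit the column.
truncated = "%(name)-35.35s" % {"name": "X" * 50}
print(len(truncated))  # 35
```

Because the name column has a fixed width, the VR and value columns line up across elements when a dataset is printed.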
-
classmethod from_json(json_dataset, bulk_data_uri_handler=None)
Add elements to the Dataset from DICOM JSON format.
New in version 1.3.
See the DICOM Standard, Part 18, Annex F.
- Parameters
json_dataset (dict or str) – dict or str representing a DICOM Data Set formatted based on the DICOM JSON Model.
bulk_data_uri_handler (callable, optional) – Callable function that accepts the “BulkDataURI” of the JSON representation of a data element and returns the actual value of data element (retrieved via DICOMweb WADO-RS).
-
get(key, default=None)
Simulate dict.get() to handle element tags and keywords.
- Parameters
key (str or int or Tuple[int, int] or BaseTag) – The element keyword or tag or the class attribute name to get.
default (obj or None, optional) – If the element or class attribute is not present, return default (default None).
- Returns
value – If key is the keyword for an element in the Dataset then return the element’s value.
dataelem.DataElement – If key is a tag for an element in the Dataset then return the DataElement instance.
value – If key is a class attribute then return its value.
-
get_item(key)
Return the raw data element if possible.
It will be raw if the user has never accessed the value, or set their own value. Note if the data element is a deferred-read element, then it is read and converted before being returned.
- Parameters
key – The DICOM (group, element) tag in any form accepted by Tag() such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc. May also be a slice made up of DICOM tags.
- Returns
The corresponding element.
-
get_private_item(group, element_offset, private_creator)
Return the data element for the given private tag group.
New in version 1.3.
This is analogous to Dataset.__getitem__(), but only for private tags. It makes it possible to find the private tag for the correct private creator without first adding the tag to the private dictionary.
- Parameters
group (int) – The private tag group where the item is located as a 32-bit int.
element_offset (int) – The lower 8 bits (i.e. 2 hex digits) of the element tag.
private_creator (str) – The private creator for the tag. Must match the private creator for the tag to be returned.
- Returns
The corresponding element.
- Raises
ValueError – If group is not part of a private tag or private_creator is empty.
KeyError – If the private creator tag is not found in the given group. If the private tag is not found.
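In DICOM, a private creator element at (gggg,00xx) reserves the block of elements (gggg,xx00) to (gggg,xxFF), and the offset supplies the last two hex digits. The sketch below shows how a full private tag follows from group, block and offset. The helper private_element_tag is hypothetical and not a pydicom function, and the block value 0x10 is an assumption chosen to reproduce the (0041, 1001) tag shown in the class examples.

```python
def private_element_tag(group, block_start, element_offset):
    """Combine group, reserved block and offset into a 32-bit private tag.

    block_start is the element number of the private creator (0x10 to 0xFF).
    """
    if group % 2 == 0:
        raise ValueError("private groups have odd group numbers")
    if not 0x10 <= block_start <= 0xFF:
        raise ValueError("private creator must lie in 0x0010-0x00FF")
    if not 0 <= element_offset <= 0xFF:
        raise ValueError("element offset must be less than 256")
    element = (block_start << 8) | element_offset
    return (group << 16) | element


# Assumed: 'My Creator' occupies the first block, (0041,0010).
tag = private_element_tag(0x0041, 0x10, 0x01)
print("(%04x, %04x)" % (tag >> 16, tag & 0xFFFF))  # (0041, 1001)
```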
-
property is_original_encoding
Return True if the encoding to be used for writing is set and is the same as that used to originally encode the Dataset.
New in version 1.1.
This includes properties related to endianness, VR handling and the (0008,0005) Specific Character Set.
-
items()
Return the Dataset items to simulate dict.items().
- Returns
The top-level (BaseTag, DataElement) items for the Dataset.
- Return type
dict_items
-
iterall()
Iterate through the Dataset, yielding all the elements.
Unlike Dataset.__iter__(), this does recurse into sequences, and so yields all elements as if the file were “flattened”.
- Yields
dataelem.DataElement
-
overlay_array(group)
Return the Overlay Data in group as a numpy.ndarray.
New in version 1.4.
- Returns
The (group,3000) Overlay Data converted to a numpy.ndarray.
- Return type
numpy.ndarray
-
property pixel_array
Return the pixel data as a numpy.ndarray.
Changed in version 1.4: Added support for Float Pixel Data and Double Float Pixel Data
- Returns
The (7FE0,0008) Float Pixel Data, (7FE0,0009) Double Float Pixel Data or (7FE0,0010) Pixel Data converted to a numpy.ndarray.
- Return type
numpy.ndarray
-
pop(key, *args)
Emulate dict.pop() with support for tags and keywords.
Removes the element for key if it exists and returns it, otherwise returns a default value if given or raises KeyError.
- Parameters
key (int or str or 2-tuple) –
    If tuple - the group and element number of the DICOM tag
    If int - the combined group/element number
    If str - the DICOM keyword of the tag
*args (zero or one argument) – Defines the behavior if no tag exists for key: if given, it defines the return value, if not given, KeyError is raised.
- Returns
The element for key if it exists, or the default value if given.
- Raises
KeyError – If the key is not a valid tag or keyword. If the tag does not exist and no default is given.
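The optional default is passed through *args so that "no default given" can be told apart from an explicit default of None. A minimal sketch of that calling convention on a plain dict follows. The function pop_element is a hypothetical illustration, not pydicom's implementation, and it skips the keyword and tuple handling that the real method performs.

```python
def pop_element(elements, key, *args):
    """Remove and return elements[key]; fall back to args[0] if supplied."""
    if len(args) > 1:
        raise TypeError("pop expected at most 2 arguments")
    try:
        return elements.pop(key)
    except KeyError:
        if args:
            # A default was given, even if that default is None.
            return args[0]
        raise


elements = {0x00100010: "CITIZEN^John"}
print(pop_element(elements, 0x00100010))        # CITIZEN^John
print(pop_element(elements, 0x00100010, None))  # None
```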
-
popitem()
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
-
private_block(group, private_creator, create=False)
Return the block for the given tag group and private_creator.
New in version 1.3.
If create is True and the private_creator does not exist, the private creator tag is added.
Notes
We ignore the unrealistic case that no free block is available.
- Parameters
group (int) – The group of the private tag to be found as a 32-bit int. Must be an odd number (i.e. a private group).
private_creator (str) – The private creator string associated with the tag.
create (bool, optional) – If True and private_creator does not exist, a new private creator tag is added at the next free block. If False (the default) and private_creator does not exist, KeyError is raised instead.
- Returns
The existing or newly created private block.
- Raises
ValueError – If group doesn’t belong to a private tag or private_creator is empty.
KeyError – If the private creator tag is not found in the given group and the create parameter is False.
-
private_creators(group)
Return a list of private creator names in the given group.
New in version 1.3.
Examples
This can be used to check if a given private creator exists in the group of the dataset:
>>> ds = Dataset()
>>> if 'My Creator' in ds.private_creators(0x0041):
...     block = ds.private_block(0x0041, 'My Creator')
- Parameters
group (int) – The private group as a 32-bit int. Must be an odd number.
- Returns
All private creator names for private blocks in the group.
- Return type
list of str
- Raises
ValueError – If group is not a private group.
-
remove_private_tags()
Remove all private elements from the Dataset.
-
save_as(filename, write_like_original=True)
Write the Dataset to filename.
Wrapper for pydicom.filewriter.dcmwrite, passing this dataset to it. See documentation for that function for details.
See also
pydicom.filewriter.dcmwrite() – Write a DICOM file from a FileDataset instance.
-
set_original_encoding(is_implicit_vr, is_little_endian, character_encoding)
Set the values for the original transfer syntax and encoding.
New in version 1.2.
Can be used for a Dataset with raw data elements to enable optimized writing (e.g. without decoding the data elements).
-
setdefault(key, default=None)
Emulate dict.setdefault() with support for tags and keywords.
Examples
>>> ds = Dataset()
>>> elem = ds.setdefault((0x0010, 0x0010), "Test")
>>> elem
(0010, 0010) Patient's Name PN: 'Test'
>>> elem.value
'Test'
>>> elem = ds.setdefault('PatientSex',
...     DataElement(0x00100040, 'CS', 'F'))
>>> elem.value
'F'
- Parameters
key (int or str or 2-tuple) –
    If tuple - the group and element number of the DICOM tag
    If int - the combined group/element number
    If str - the DICOM keyword of the tag
default (type, optional) – The default value that is inserted and returned if no data element exists for the given key. If it is not of type DataElement, one will be constructed instead for the given tag and default as value. This is only possible for known tags (e.g. tags found via the dictionary lookup).
- Returns
The data element for key if it exists, or the default value if it is a DataElement or None, or a DataElement constructed with default as value.
- Return type
DataElement or type
- Raises
KeyError – If the key is not a valid tag or keyword. If no tag exists for key, default is not a DataElement and not None, and key is not a known DICOM tag.
-
to_json(bulk_data_threshold=1024, bulk_data_element_handler=None, dump_handler=None)
Return a JSON representation of the Dataset.
New in version 1.3.
See the DICOM Standard, Part 18, Annex F.
- Parameters
bulk_data_threshold (int, optional) – Threshold for the length of a base64-encoded binary data element above which the element should be considered bulk data and the value provided as a URI rather than included inline (default: 1024). Ignored if no bulk data handler is given.
bulk_data_element_handler (callable, optional) – Callable function that accepts a bulk data element and returns a JSON representation of the data element (dictionary including the “vr” key and either the “InlineBinary” or the “BulkDataURI” key).
dump_handler (callable, optional) – Callable function that accepts a dict and returns the serialized (dumped) JSON string (by default uses json.dumps()).
- Returns
Dataset serialized into a string based on the DICOM JSON Model.
- Return type
str
Examples
>>> import json
>>> def my_json_dumps(data):
...     return json.dumps(data, indent=4, sort_keys=True)
>>> ds.to_json(dump_handler=my_json_dumps)
-
to_json_dict(bulk_data_threshold=1024, bulk_data_element_handler=None)
Return a dictionary representation of the Dataset conforming to the DICOM JSON Model as described in the DICOM Standard, Part 18, Annex F.
New in version 1.4.
- Parameters
bulk_data_threshold (int, optional) – Threshold for the length of a base64-encoded binary data element above which the element should be considered bulk data and the value provided as a URI rather than included inline (default: 1024). Ignored if no bulk data handler is given.
bulk_data_element_handler (callable, optional) – Callable function that accepts a bulk data element and returns a JSON representation of the data element (dictionary including the “vr” key and either the “InlineBinary” or the “BulkDataURI” key).
- Returns
Dataset representation based on the DICOM JSON Model.
- Return type
dict
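The interplay of the threshold and the bulk data handler can be sketched with the standard library alone. In the example below, encode_binary_element and its uri_for argument are hypothetical names introduced for illustration and do not mirror pydicom's internal code. The dictionary layout (8-digit hexadecimal tag keys, "vr", "Value", "InlineBinary", "BulkDataURI") follows the DICOM JSON Model.

```python
import base64
import json


def encode_binary_element(vr, raw, bulk_data_threshold=1024, uri_for=None):
    """Return a DICOM JSON Model entry for a binary value.

    uri_for is a hypothetical callable that maps the raw bytes to a URI.
    Without it the threshold is ignored and the value is always inlined.
    """
    encoded = base64.b64encode(raw).decode("ascii")
    if uri_for is not None and len(encoded) > bulk_data_threshold:
        return {"vr": vr, "BulkDataURI": uri_for(raw)}
    return {"vr": vr, "InlineBinary": encoded}


model = {
    "00100010": {"vr": "PN", "Value": [{"Alphabetic": "Some^Name"}]},
    "7FE00010": encode_binary_element("OB", b"\x00" * 16),
}
print(json.dumps(model, indent=4, sort_keys=True))
```

With 16 bytes of pixel data the base64 text is 24 characters long, so the value stays inline. A payload whose encoded form exceeds the threshold would be replaced by a URI only when a handler is supplied.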
-
top()
Return a str representation of the top level elements.
-
trait_names()
Return a list of valid names for auto-completion code.
Used in IPython, so that data element names can be found and offered for autocompletion on the IPython command line.
-
update(dictionary)
Extend dict.update() to handle DICOM tags and keywords.
-
values()
Return the Dataset values to simulate dict.values().
- Returns
The DataElements that make up the values of the Dataset.
- Return type
dict_values
-
walk(callback, recursive=True)
Iterate through the Dataset's elements and run callback on each.
Visit all elements in the Dataset, possibly recursing into sequences and their items. The callback function is called for each DataElement (including elements with a VR of ‘SQ’). Can be used to perform an operation on certain types of elements.
For example, remove_private_tags() finds all elements with private tags and deletes them.
The elements will be returned in order of increasing tag number within their current Dataset.
- Parameters
callback – A callable function that takes two arguments:
    a Dataset
    a DataElement belonging to that Dataset
recursive (bool, optional) – Flag to indicate whether to recurse into sequences (default True).
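The traversal order described for walk() can be sketched on nested dictionaries. In the example below each "dataset" is a plain dict keyed by integer tag and each "element" is a dict with tag, VR and value entries. These are stand-ins chosen for illustration, not pydicom objects, and the function is a simplified model rather than the library's code.

```python
def walk(dataset, callback, recursive=True):
    """Call callback(dataset, element) for each element in increasing tag order."""
    for tag in sorted(dataset):
        element = dataset[tag]
        # The sequence element itself is visited before its items.
        callback(dataset, element)
        if recursive and element["VR"] == "SQ":
            for item in element["value"]:
                walk(item, callback, recursive)


nested = {
    0x300A00B0: {"tag": 0x300A00B0, "VR": "SQ", "value": [
        {0x00080070: {"tag": 0x00080070, "VR": "LO", "value": "Linac, co."}},
    ]},
    0x00100010: {"tag": 0x00100010, "VR": "PN", "value": "CITIZEN^John"},
}

visited = []
walk(nested, lambda ds, elem: visited.append(elem["tag"]))
print([hex(t) for t in visited])  # ['0x100010', '0x300a00b0', '0x80070']
```

The nested element (0008,0070) comes last even though its tag is the smallest, because ordering applies only within each dataset.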
-