This page was generated from doc/user_guide/load_save_data.ipynb.

Load and save data

Load patterns

From a file

kikuchipy can read and write experimental EBSD patterns and EBSD master patterns from/to multiple formats (see supported formats). To load patterns from file use the load() function. Let’s import the necessary libraries and read the Nickel EBSD test data set directly from file (not via kikuchipy.data.nickel_ebsd_small()):

[1]:
# Exchange inline for notebook or qt5 (from pyqt) for interactive plotting
%matplotlib inline

import tempfile
import dask.array as da
import hyperspy.api as hs
import kikuchipy as kp
import numpy as np
import matplotlib.pyplot as plt


datadir = "../../kikuchipy/data/"
nordif_ebsd = "nordif/Pattern.dat"
s = kp.load(datadir + nordif_ebsd)
s
WARNING:hyperspy.api:The ipywidgets GUI elements are not available, probably because the hyperspy_gui_ipywidgets package is not installed.
WARNING:hyperspy.api:The traitsui GUI elements are not available, probably because the hyperspy_gui_traitsui package is not installed.
[1]:
<EBSD, title: Pattern, dimensions: (3, 3|60, 60)>

Or, load the stereographic projection of the northern hemisphere of an EBSD master pattern for a 20 keV beam energy from a modified version of EMsoft’s master pattern file, returned from their EMEBSDmaster.f90 program:

[2]:
emsoft_master_pattern = (
 "emsoft_ebsd_master_pattern/ni_mc_mp_20kv_uint8_gzip_opts9.h5"
)
s_mp = kp.load(filename=datadir + emsoft_master_pattern)
s_mp
[2]:
<EBSDMasterPattern, title: ni_mc_mp_20kv_uint8_gzip_opts9, dimensions: (|401, 401)>

Both the stereographic and the square Lambert projections of this master pattern data are available via kikuchipy.data.nickel_ebsd_master_pattern_small().

All file readers support accessing the data without loading it into memory (with the Dask library), which can be useful when processing large data sets to avoid memory errors:

[3]:
s_lazy = kp.load(datadir + nordif_ebsd, lazy=True)
print(s_lazy)
s_lazy.data
<LazyEBSD, title: Pattern, dimensions: (3, 3|60, 60)>
[3]:
        Array             Chunk
Bytes   31.64 kiB         31.64 kiB
Shape   (3, 3, 60, 60)    (3, 3, 60, 60)
Count   2 Tasks           1 Chunks
Type    uint8             numpy.ndarray

Parts or all of the data can be read into memory by calling compute():

[4]:
s_lazy_copy = s_lazy.inav[:2, :].deepcopy()
s_lazy_copy.compute()
s_lazy_copy
[########################################] | 100% Completed |  0.1s
[4]:
<EBSD, title: Pattern, dimensions: (2, 3|60, 60)>
[5]:
s_lazy.compute()
s_lazy
[########################################] | 100% Completed |  0.1s
[5]:
<EBSD, title: Pattern, dimensions: (3, 3|60, 60)>

Note

Lazily loaded EBSD patterns are processed chunk by chunk, which in many cases leads to longer processing times, so lazy data sets should be handled with some care. See the relevant HyperSpy user guide for information on how to do this.
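To make "chunk by chunk" concrete, here is a minimal NumPy-only sketch. It is not how kikuchipy, HyperSpy or Dask implement lazy processing; the helper names `process_in_chunks()` and `subtract_pattern_mean()` are hypothetical and only illustrate that a per-pattern operation gives the same result whether it is applied to the whole array at once or to one block of scan rows at a time.

```python
import numpy as np


def process_in_chunks(patterns, func, rows_per_chunk=2):
    """Apply ``func`` to blocks of scan rows, one block at a time.

    Hypothetical helper for illustration only.
    """
    out = np.empty(patterns.shape, dtype=np.float64)
    for start in range(0, patterns.shape[0], rows_per_chunk):
        stop = start + rows_per_chunk
        # Only this block has to be processed (and held) at a time
        out[start:stop] = func(patterns[start:stop])
    return out


def subtract_pattern_mean(block):
    """Subtract each pattern's own mean intensity."""
    return block - block.mean(axis=(-2, -1), keepdims=True)


# Dummy data with the same shape and data type as the test data set
rng = np.random.default_rng(0)
patterns = rng.integers(0, 256, size=(3, 3, 60, 60), dtype=np.uint8)

chunked = process_in_chunks(patterns, subtract_pattern_mean)
in_memory = subtract_pattern_mean(patterns)
print(np.allclose(chunked, in_memory))
```

The loop trades one large computation for several smaller ones, which is where the extra processing time can come from.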

The data are visualized by moving through navigation space, with the signal (pattern) shown at each navigation point:

[6]:
s.plot()
../_images/user_guide_load_save_data_12_0.png
../_images/user_guide_load_save_data_12_1.png

Upon loading, kikuchipy tries to read all scan information from the file and stores everything it can read in the original_metadata attribute:

[7]:
# s.original_metadata  # Long output

Also, some information may be stored in a standard location in the metadata attribute where it can be used by EBSD class methods:

[8]:
s.metadata
[8]:
  • Acquisition_instrument
    • SEM
      • Detector
        • EBSD
          • azimuth_angle = 0.0
          • binning = 1
          • detector = NORDIF UF1100
          • elevation_angle = 0.0
          • exposure_time = 0.0035
          • frame_number = -1
          • frame_rate = 202
          • gain = 0.0
          • grid_type = square
          • manufacturer = NORDIF
          • sample_tilt = 70.0
          • scan_time = 148
          • static_background = array([[84, 87, 90, ..., 27, 29, 30], [87, 90, 93, ..., 27, 28, 30], ... 80, 82, ..., 28, 26, 26], [76, 78, 80, ..., 26, 26, 25]], dtype=uint8)
          • version = 3.1.2
          • xpc = -1.0
          • ypc = -1.0
          • zpc = -1.0
      • beam_energy = 20.0
      • magnification = 200
      • microscope = Hitachi SU-6600
      • working_distance = 24.7
  • General
    • original_filename = ../../kikuchipy/data/nordif/Pattern.dat
    • title = Pattern
  • Sample
    • Phases
      • 1
        • atom_coordinates
          • 1
            • atom =
            • coordinates = array([0., 0., 0.])
            • debye_waller_factor = 0.0
            • site_occupation = 0.0
        • formula =
        • info =
        • lattice_constants = array([0., 0., 0., 0., 0., 0.])
        • laue_group =
        • material_name = Ni
        • point_group =
        • setting = 0
        • source =
        • space_group = 0
        • symmetry = 0
  • Signal
    • binned = False
    • signal_type = EBSD

Warning

The Acquisition_instrument.SEM.Detector.EBSD and Sample.Phases metadata nodes are deprecated and will be removed in v0.6.

There are three main reasons for this change. The first is that only the static background array stored in the Acquisition_instrument.SEM.Detector.EBSD.static_background node is used internally, and so the remaining metadata is unnecessary. The background pattern will be stored in its own EBSD.static_background property instead. The second is that keeping track of the unnecessary metadata makes writing and maintaining input/output plugins challenging. The third is that the EBSD.xmap and EBSD.detector properties, which keep track of the CrystalMap and EBSDDetector for a signal, respectively, should be used instead of the more “static” metadata.

The number of patterns in the horizontal and vertical directions, the pattern size in pixels, the scan step size and the detector pixel size are stored in the axes_manager attribute:

[9]:
print(s.axes_manager)  # Just "s.axes_manager" looks bad on a dark background
<Axes manager, axes: (3, 3|60, 60)>
            Name |   size |  index |  offset |   scale |  units
================ | ====== | ====== | ======= | ======= | ======
               x |      3 |      0 |       0 |     1.5 |     um
               y |      3 |      0 |       0 |     1.5 |     um
---------------- | ------ | ------ | ------- | ------- | ------
              dx |     60 |        |       0 |       1 |     um
              dy |     60 |        |       0 |       1 |     um

This information can be modified directly, and information in metadata and axes_manager can also be modified by the EBSD class methods set_experimental_parameters(), set_phase_parameters(), set_scan_calibration() and set_detector_calibration(). For example, to set or change the accelerating voltage, horizontal pattern centre coordinate and static background pattern (stored as a numpy.ndarray):

[10]:
s.set_experimental_parameters(
    beam_energy=15,
    xpc=0.5073,
    static_background=plt.imread(
        datadir + "nordif/Background acquisition pattern.bmp"
    )
)
/home/docs/checkouts/readthedocs.org/user_builds/kikuchipy/envs/stable/lib/python3.8/site-packages/kikuchipy/signals/ebsd.py:165: VisibleDeprecationWarning: Function `set_experimental_parameters()` is deprecated and will be removed in version 0.6.
  def set_experimental_parameters(

In addition to the metadata, original_metadata and axes_manager properties provided by HyperSpy, kikuchipy tries to read a CrystalMap object with indexing results into an xmap property and an EBSDDetector object into a detector property:

[11]:
s.xmap  # This is empty unless it is set
[12]:
s.detector
[12]:
EBSDDetector (60, 60), px_size 1.0 um, binning 1, tilt 0, azimuthal 0, pc (0.5, 0.5, 0.5)

From a NumPy array

An EBSD or EBSDMasterPattern signal can also be created directly from a numpy.ndarray. To create a data set of (60 x 60) pixel patterns in a (10 x 20) grid, i.e. 10 and 20 patterns in the horizontal and vertical scan directions respectively, of random intensities:

[13]:
s_np = kp.signals.EBSD(np.random.random((20, 10, 60, 60)))
s_np
[13]:
<EBSD, title: , dimensions: (10, 20|60, 60)>
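Note that the array shape (20, 10, 60, 60) is shown as (10, 20|60, 60): the array is ordered row first, while the signal representation lists the horizontal size first. The sketch below spells out this mapping with a hypothetical helper, `repr_dimensions()`, which is not part of kikuchipy or HyperSpy.

```python
import numpy as np


def repr_dimensions(data_shape, signal_ndim=2):
    """Return (navigation, signal) sizes in the order shown for a signal.

    Hypothetical helper: the array axes are (scan row, scan column,
    pattern row, pattern column), while each group is displayed as
    (horizontal, vertical), i.e. reversed.
    """
    nav = tuple(data_shape[:-signal_ndim][::-1])
    sig = tuple(data_shape[-signal_ndim:][::-1])
    return nav, sig


data = np.random.random((20, 10, 60, 60))
nav, sig = repr_dimensions(data.shape)
print(nav, sig)

# The pattern in scan column x = 3 and scan row y = 15 is indexed row first
pattern = data[15, 3]
print(pattern.shape)
```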

From a Dask array

When processing large data sets, it is useful to load data lazily with the Dask library. This can be done upon reading patterns from a file by setting lazy=True when using the load() function, or directly from a dask.array.Array:

[14]:
s_da = kp.signals.LazyEBSD(
    da.random.random((20, 10, 60, 60), chunks=(2, 10, 60, 60))
)
print(s_da)
s_da.data
<LazyEBSD, title: , dimensions: (10, 20|60, 60)>
[14]:
        Array               Chunk
Bytes   5.49 MiB            562.50 kiB
Shape   (20, 10, 60, 60)    (2, 10, 60, 60)
Count   10 Tasks            10 Chunks
Type    float64             numpy.ndarray
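The sizes reported above follow directly from the array shape, the chunk shape and the data type. The following sketch reproduces the arithmetic without Dask, which can help when choosing a chunk shape that fits in memory; the variable names are only illustrative.

```python
import math

import numpy as np

shape = (20, 10, 60, 60)
chunks = (2, 10, 60, 60)
itemsize = np.dtype("float64").itemsize  # 8 bytes per value

array_bytes = math.prod(shape) * itemsize
chunk_bytes = math.prod(chunks) * itemsize
# Number of chunks along each axis, multiplied together
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

print(f"Array: {array_bytes / 1024**2:.2f} MiB")
print(f"Chunk: {chunk_bytes / 1024:.2f} kiB")
print(f"Number of chunks: {n_chunks}")
```

Splitting only the first axis in steps of two gives 10 chunks of 562.50 kiB each, together making up the 5.49 MiB array.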

From a HyperSpy signal

HyperSpy provides the method set_signal_type() to change between BaseSignal subclasses, of which EBSD, EBSDMasterPattern and VirtualBSEImage are three. To get one of these objects from a HyperSpy Signal2D object:

[15]:
s_hs = hs.signals.Signal2D(np.random.random((20, 10, 60, 60)))
s_hs
[15]:
<Signal2D, title: , dimensions: (10, 20|60, 60)>
[16]:
s_hs.set_signal_type("EBSD")
s_hs
[16]:
<EBSD, title: , dimensions: (10, 20|60, 60)>
[17]:
s_hs.set_signal_type("VirtualBSEImage")
s_hs
[17]:
<VirtualBSEImage, title: , dimensions: (10, 20|60, 60)>
[18]:
s_hs.set_signal_type("EBSDMasterPattern")
s_hs
[18]:
<EBSDMasterPattern, title: , dimensions: (10, 20|60, 60)>

Save patterns

To save experimental EBSD patterns to file use the save() method. For example, to save an EBSD signal in an HDF5 file, with file name patterns.h5, in our default h5ebsd format:

[19]:
temp_dir = tempfile.mkdtemp() + "/"
s.save(temp_dir + "patterns")

Warning

If we want to overwrite an existing file:

s.save("patterns.h5", overwrite=True)

If we want to save patterns in NORDIF’s binary .dat format instead:

[20]:
s.save(temp_dir + "patterns.dat")

To save an EBSDMasterPattern to an HDF5 file, we use the save method inherited from HyperSpy to write to their HDF5 specification:

[21]:
s_hs.save(temp_dir + "master_pattern.hspy")
s_hs
[21]:
<EBSDMasterPattern, title: , dimensions: (10, 20|60, 60)>

These master patterns can then be read into an EBSDMasterPattern signal again via HyperSpy’s load():

[22]:
s_mp2 = hs.load(
    temp_dir + "master_pattern.hspy", signal_type="EBSDMasterPattern"
)
s_mp2
[22]:
<EBSDMasterPattern, title: , dimensions: (10, 20|60, 60)>

Note

To save results from statistical decomposition (machine learning) of patterns to file see the section Saving and loading results in HyperSpy’s user guide. Note that the file extension .hspy must be used upon saving, s.save('patterns.hspy'), as the default extension in kikuchipy, .h5, yields a kikuchipy h5ebsd file where the decomposition results aren’t saved. The saved patterns can then be reloaded using HyperSpy’s load() function and passing the signal_type="EBSD" parameter as explained above.


Supported EBSD formats

Currently, kikuchipy has readers and writers for the following formats:

Note

If you want to process your patterns with kikuchipy, but use an unsupported EBSD vendor software, or if you want to write your processed patterns to a vendor format that does not support writing, please request this feature in our issue tracker.

EMsoft simulated EBSD HDF5

Dynamically simulated EBSD patterns returned by EMsoft’s EMEBSD.f90 program as HDF5 files can be read as an EBSD signal:

[23]:
emsoft_ebsd = "emsoft_ebsd/simulated_ebsd.h5"  # Dummy data set
s_sim = kp.load(filename=datadir + emsoft_ebsd)
s_sim
[23]:
<EBSD, title: simulated_ebsd, dimensions: (10|10, 10)>

Here, the EMsoft simulated EBSD file_reader() is called, which takes the optional argument scan_size. Passing scan_size=(2, 5) will reshape the pattern data from (10, 10, 10) to (2, 5, 10, 10):

[24]:
s_sim2 = kp.load(filename=datadir + emsoft_ebsd, scan_size=(2, 5))
print(s_sim2)
print(s_sim2.data.shape)
<EBSD, title: simulated_ebsd, dimensions: (5, 2|10, 10)>
(2, 5, 10, 10)
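The effect of scan_size on the data shape can be mimicked with a plain NumPy reshape. This is only a sketch with dummy data; it assumes row-major (C) ordering of the patterns, which is NumPy's default but is not confirmed here to be what the reader does internally.

```python
import numpy as np

# Ten dummy (10 x 10) pixel patterns stored one after the other
n_patterns, n_rows_px, n_cols_px = 10, 10, 10
flat = np.arange(n_patterns * n_rows_px * n_cols_px).reshape(
    (n_patterns, n_rows_px, n_cols_px)
)

scan_size = (2, 5)  # Same value as passed to kp.load() above
scan = flat.reshape(scan_size + flat.shape[1:])
print(scan.shape)

# With row-major order, pattern i ends up in map row i // 5, column i % 5
i = 7
row, col = divmod(i, scan_size[1])
print(row, col, np.array_equal(scan[row, col], flat[i]))
```

The product of the scan_size entries must equal the number of patterns, otherwise the reshape fails.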

Simulated EBSD patterns can be written to the kikuchipy h5ebsd format, the NORDIF binary format, or to HDF5 files using HyperSpy’s HDF5 specification as explained above.

EMsoft EBSD master pattern HDF5

Master patterns returned by EMsoft’s EMEBSDmaster.f90 program as HDF5 files can be read as an EBSDMasterPattern signal:

[25]:
s_mp = kp.load(filename=datadir + emsoft_master_pattern)

print(s_mp)
print(s_mp.projection)
print(s_mp.hemisphere)
print(s_mp.phase)
<EBSDMasterPattern, title: ni_mc_mp_20kv_uint8_gzip_opts9, dimensions: (|401, 401)>
stereographic
north
<name: ni. space group: Fm-3m. point group: m-3m. proper point group: 432. color: tab:blue>

Here, the EMsoft EBSD master pattern file_reader() is called, which takes the optional arguments projection, hemisphere and energy. The stereographic projection is read by default. Passing projection="lambert" will read the square Lambert projection instead. The northern hemisphere is read by default. Passing hemisphere="south" or hemisphere="both" will read the southern hemisphere projection or both, respectively. Master patterns for all beam energies are read by default. Passing energy=(10, 20) or energy=15 will read the master pattern(s) with beam energies from 10 to 20 keV, or just 15 keV, respectively:

[26]:
s_mp = kp.load(
    datadir + emsoft_master_pattern,
    projection="lambert",
    hemisphere="both",
    energy=20
)

print(s_mp)
print(s_mp.projection)
print(s_mp.hemisphere)
<EBSDMasterPattern, title: ni_mc_mp_20kv_uint8_gzip_opts9, dimensions: (2|401, 401)>
lambert
both

Master patterns can be written to HDF5 files using HyperSpy’s HDF5 specification as explained above.

See [JPDeGraef19] for a hands-on tutorial explaining how to simulate these patterns with EMsoft, and [CDeGraef13] for details of the underlying theory.

h5ebsd

The h5ebsd format [JGU+14] is based on the HDF5 open standard (Hierarchical Data Format version 5). HDF5 files can be read and edited using e.g. the HDF Group’s reader HDFView or the Python package used by kikuchipy, h5py. Upon loading an HDF5 file with extension .h5, .hdf5, or .h5ebsd, the correct reader is determined from the file. Supported h5ebsd formats are listed in the table above.

If an h5ebsd file contains multiple scans, as many scans as desired can be read from the file. For example, if the file contains two scans with names My awes0m4 Xcan #! with a long title and Scan 2:

[27]:
kikuchipy_ebsd = "kikuchipy_h5ebsd/patterns.h5"
s_awsm, s2 = kp.load(
    filename=datadir + kikuchipy_ebsd,
    scan_group_names=["My awes0m4 Xcan #! with a long title", "Scan 2"]
)
print(s_awsm)
print(s2)
<EBSD, title: patterns My awes0m4 ..., dimensions: (3, 3|60, 60)>
<EBSD, title: patterns Scan 2, dimensions: (3, 3|60, 60)>

Here, the h5ebsd file_reader() is called. If only Scan 2 is to be read, scan_group_names="Scan 2" can be passed:

[28]:
s2 = kp.load(filename=datadir + kikuchipy_ebsd, scan_group_names="Scan 2")
s2
[28]:
<EBSD, title: patterns Scan 2, dimensions: (3, 3|60, 60)>

The scan_group_names parameter is unnecessary if only the first scan in the file is to be read, since reading only the first scan in the file is the default behaviour.

So far, only saving patterns to kikuchipy’s own h5ebsd format is supported. It is possible to write a new scan with a scan name Scan x, where x is an integer, to an existing, but closed, h5ebsd file in the kikuchipy format, e.g. one containing only Scan 1, by passing:

[29]:
new_file = "patterns_new.h5"
s2.save(temp_dir + new_file, scan_number=1)
s_awsm.save(filename=temp_dir + new_file, add_scan=True, scan_number=2)

s2_new, s_awsm_new = kp.load(
    filename=temp_dir + new_file, scan_group_names=["Scan 1", "Scan 2"]
)
print(s2_new)
print(s_awsm_new)
<EBSD, title: patterns_new Scan 1, dimensions: (3, 3|60, 60)>
<EBSD, title: patterns_new Scan 2, dimensions: (3, 3|60, 60)>

Here, the h5ebsd file_writer() is called.

Note

The EBSD.xmap and EBSD.detector properties are so far not written to this file format.

NORDIF binary

Patterns acquired using NORDIF’s acquisition software are stored in a binary file usually named “Pattern.dat”. Scan information is stored in a separate text file usually named “Setting.txt”, and both files usually reside in the same directory. If this is the case, the patterns can be loaded by passing the file name as the only parameter. If this is not the case, the setting file can be passed upon loading:

[30]:
s_nordif = kp.load(
    filename=datadir + nordif_ebsd, setting_file=datadir + "nordif/Setting.txt"
)
s_nordif
[30]:
<EBSD, title: Pattern, dimensions: (3, 3|60, 60)>

Here, the NORDIF file_reader() is called. If the scan information, i.e. scan and pattern size, in the setting file is incorrect or the setting file is not available, patterns can be loaded by passing:

[31]:
s_nordif = kp.load(
    filename=datadir + nordif_ebsd, scan_size=(1, 9), pattern_size=(60, 60)
)
s_nordif
[31]:
<EBSD, title: Pattern, dimensions: (9|60, 60)>
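The reason the sizes must be supplied when no setting file is available is that a headerless binary file is just a flat run of intensities, which can only be interpreted once the scan and pattern sizes are known. The sketch below illustrates this with a stand-in file written by NumPy. It assumes 8-bit unsigned intensities stored in row-major order and is not a description of the NORDIF format or of kikuchipy's reader; the file name and variable names are hypothetical.

```python
import os
import tempfile

import numpy as np

nav_shape = (3, 3)  # Scan rows and columns (assumed known)
pattern_shape = (60, 60)  # Pattern rows and columns in pixels

rng = np.random.default_rng(0)
patterns = rng.integers(
    0, 256, size=nav_shape + pattern_shape, dtype=np.uint8
)

with tempfile.TemporaryDirectory() as tmp:
    fname = os.path.join(tmp, "raw_patterns.dat")  # Stand-in file
    patterns.tofile(fname)  # Writes the intensities only, no header

    # One byte per pixel, so the file size must match the requested sizes
    file_bytes = os.path.getsize(fname)
    expected_bytes = int(np.prod(nav_shape + pattern_shape))

    raw = np.fromfile(fname, dtype=np.uint8)  # 1D array of intensities
    read_back = raw.reshape(nav_shape + pattern_shape)

print(file_bytes == expected_bytes)
print(np.array_equal(read_back, patterns))
```

Comparing the file size with the expected number of bytes is a cheap sanity check before reshaping.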

If a static background pattern named “Background acquisition pattern.bmp” is stored in the same directory as the pattern file, this is stored in metadata upon loading.

Patterns can also be saved to a NORDIF binary file, upon which the NORDIF file_writer() is called. Note, however, that so far no new setting file, background pattern, or calibration patterns are created upon saving.

NORDIF calibration patterns

NORDIF calibration patterns in bitmap format named “Calibration (x,y).bmp”, where “x” and “y” correspond to coordinates listed in the NORDIF setting file, usually named “Setting.txt”, can be loaded:

[32]:
s_nordif_cal = kp.load(filename=datadir + "nordif/Setting.txt")
s_nordif_cal
[32]:
<EBSD, title: Calibration patterns, dimensions: (2|60, 60)>

Here, the NORDIF calibration patterns file_reader() is called. Lazy loading is not supported for this reader, thus the lazy parameter is not used.

If a static background pattern named “Background calibration pattern.bmp” is stored in the same directory as the pattern file, this is stored in metadata upon loading.

Oxford Instruments binary

Uncompressed patterns stored in the Oxford Instruments binary .ebsp file format, with intensities as 8-bit or 16-bit unsigned integers, can be read:

[33]:
oxford_binary_path = datadir + "oxford_binary/"
s_oxford = kp.load(filename=oxford_binary_path + "patterns.ebsp")
s_oxford
[33]:
<EBSD, title: patterns, dimensions: (3, 3|60, 60)>

Here, the Oxford Instruments binary file_reader() is called.

Every pattern’s flattened index into the 2D navigation map, as well as their entry in the file (map order isn’t always the same as file order) can be retrieved from s_oxford.original_metadata.map1d_id and s_oxford.original_metadata.file_order, respectively. If available in the file, every pattern’s row and column beam position in microns can be retrieved from s_oxford.original_metadata.beam_y and s_oxford.original_metadata.beam_x, respectively. All these are 1D arrays.

[34]:
s_oxford.original_metadata
[34]:
  • beam_x = array([0. , 1.5, 3. , 0. , 1.5, 3. , 0. , 1.5, 3. ])
  • beam_y = array([0. , 0. , 0. , 1.5, 1.5, 1.5, 3. , 3. , 3. ])
  • file_order = array([8, 0, 1, 2, 3, 4, 5, 6, 7])
  • map1d_id = array([0, 1, 2, 3, 4, 5, 6, 7, 8])

If the beam positions aren’t present in the file, the returned signal will have a single navigation dimension the size of the number of patterns.

Files with only the non-indexed patterns can also be read. The returned signal will then have a single navigation dimension the size of the number of patterns. The flattened index into the 2D navigation map mentioned above can be useful to determine the location of each non-indexed pattern.
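As a sketch of how the flattened map index relates to a pattern's location, the arrays printed above can be converted to (row, column) positions with NumPy. The code assumes row-major flattening of the 2D map and a step size of 1.5 um in both directions, as suggested by the printed values; it does not use kikuchipy.

```python
import numpy as np

# Values copied from the original_metadata printout above
beam_x = np.array([0.0, 1.5, 3.0, 0.0, 1.5, 3.0, 0.0, 1.5, 3.0])
beam_y = np.array([0.0, 0.0, 0.0, 1.5, 1.5, 1.5, 3.0, 3.0, 3.0])
map1d_id = np.arange(9)

step = 1.5  # Scan step size in um (assumed equal along x and y)
cols = np.rint(beam_x / step).astype(int)
rows = np.rint(beam_y / step).astype(int)
map_shape = (int(rows.max()) + 1, int(cols.max()) + 1)

# Flattened index -> (row, column), assuming row-major order
rows_from_id, cols_from_id = np.unravel_index(map1d_id, map_shape)

# (row, column) from the beam positions -> flattened index
id_from_beam = np.ravel_multi_index((rows, cols), map_shape)

print(map_shape)
print(np.array_equal(id_from_beam, map1d_id))
```

For this data set, the positions derived from the beam coordinates agree with those derived from the flattened index.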

From kikuchipy into other software

Patterns saved in the h5ebsd format can be read by the dictionary indexing and related routines in EMsoft using the EMEBSD reader. Those routines in EMsoft also have a NORDIF reader.

Patterns saved in the h5ebsd format can of course be read in Python like any other HDF5 data set:

[35]:
import h5py
with h5py.File(datadir + kikuchipy_ebsd, mode="r") as f:
    dset = f['Scan 2/EBSD/Data/patterns']
    print(dset)
    patterns = dset[()]
    print(patterns.shape)
    plt.figure()
    plt.imshow(patterns[0], cmap="gray")
    plt.axis("off")
<HDF5 dataset "patterns": shape (9, 60, 60), type "|u1">
(9, 60, 60)
../_images/user_guide_load_save_data_75_1.png

Load and save virtual BSE images

One or more virtual backscatter electron (BSE) images in a VirtualBSEImage signal can be read and written to file using one of HyperSpy’s many readers and writers. If they are only to be used internally in HyperSpy, they can be written to and read back from HyperSpy’s HDF5 specification as explained above for EBSD master patterns.

If we want to write the images to image files, HyperSpy also provides a series of image readers/writers, as explained in their IO user guide. If we wanted to write them as a stack of TIFF images:

[36]:
# Get virtual image from generator
vbse_gen = kp.generators.VirtualBSEGenerator(s)
print(vbse_gen)

print(vbse_gen.grid_shape)
vbse = vbse_gen.get_images_from_grid()
print(vbse)
VirtualBSEGenerator for <EBSD, title: Pattern, dimensions: (3, 3|60, 60)>
(5, 5)
<VirtualBSEImage, title: , dimensions: (5, 5|3, 3)>
[37]:
vbse.rescale_intensity()
vbse.unfold_navigation_space()  # 1D navigation space required for TIFF
vbse
Rescaling the image intensities:
[########################################] | 100% Completed |  0.1s
[37]:
<VirtualBSEImage, title: , unfolded dimensions: (25|3, 3)>
[38]:
vbse_fname = "vbse.tif"
vbse.save(temp_dir + vbse_fname)  # Easily read into e.g. ImageJ

We can also write them to e.g. png or bmp files with Matplotlib:

[39]:
nav_size = vbse.axes_manager.navigation_size
# Write one PNG file per virtual image
for i in range(nav_size):
    plt.imsave(temp_dir + f"vbse{i}.png", vbse.inav[i].data)

Read the TIFF stack back into a VirtualBSEImage signal:

[40]:
vbse2 = hs.load(temp_dir + vbse_fname, signal_type="VirtualBSEImage")
vbse2
WARNING:tifffile.tifffile:ImageJ series: detected extra dimension 'I'
[40]:
<VirtualBSEImage, title: , dimensions: (25|3, 3)>