Data Conversion Tutorial
up4 is built on the HDF5 data format. To use up4, we must first convert our data to HDF5.
Currently, two file types can be converted: CSV and VTK.
For VTK, this covers both the legacy ASCII format (.vtk) and the modern XML-based formats.
CSV
CSV files can be converted in the following manner:
import up4
up4.Converter.csv(
'path/to/data.csv', # Path to the csv file
'output.hdf5', # filename to write
columns = [0,1,2,3], # Columns to read, pointing to the time, x-, y-, z-positions
delimiter = ',', # Delimiter used in the csv file
header = True, # Does the csv file have a header?
comment = '#', # Comment character
vel = True, # Do you want to calculate the velocity
interpolate = True, # Do you want to interpolate the data
radius = 0.1, # Radius of the particle
)
It is strongly recommended to interpolate the data. up4 sometimes relies on the spacing between data points being constant. Interpolation also helps to remove the effects of sample rate when comparing datasets acquired by different techniques.
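To illustrate what interpolating onto a constant time spacing means, here is a minimal pure-Python sketch using linear interpolation. The helper name `resample_uniform` and the sample values are hypothetical, and this is not necessarily the scheme up4 applies internally.

```python
from bisect import bisect_right


def resample_uniform(times, values, dt):
    """Linearly interpolate (times, values) onto a grid with constant spacing dt.

    Hypothetical helper for illustration; `times` must be strictly increasing.
    """
    n_steps = int((times[-1] - times[0]) / dt + 1e-9)
    grid = [times[0] + i * dt for i in range(n_steps + 1)]
    resampled = []
    for t in grid:
        # Index of the last sample at or before t, clamped so i + 1 stays valid.
        i = min(max(bisect_right(times, t) - 1, 0), len(times) - 2)
        t0, t1 = times[i], times[i + 1]
        weight = (t - t0) / (t1 - t0)
        resampled.append(values[i] + weight * (values[i + 1] - values[i]))
    return grid, resampled


# Irregularly sampled x-positions of one particle (made-up numbers).
t = [0.0, 0.1, 0.4, 0.5, 1.0]
x = [0.0, 1.0, 4.0, 5.0, 10.0]
grid, x_uniform = resample_uniform(t, x, dt=0.25)
```

After resampling, every pair of neighbouring points in `grid` is separated by the same `dt`, which is the property the recommendation above is about.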
If the CSV file contains velocity information, you can read it in by extending the columns list from 4 to 7 elements, pointing to t, x, y, z, vx, vy, vz.
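As a rough illustration of how a seven-element column selection maps onto a file, the sketch below parses a small made-up CSV with the standard library. The file content and the `names` list are assumptions for the example; up4's own parser may handle headers and comments differently.

```python
import csv
import io

# Made-up CSV content with a header row and a comment line.
text = """t,x,y,z,vx,vy,vz
# units: s, m, m, m, m/s, m/s, m/s
0.0,0.10,0.20,0.30,1.0,0.0,-1.0
0.1,0.20,0.20,0.20,1.0,0.0,-1.0
"""

columns = [0, 1, 2, 3, 4, 5, 6]  # t, x, y, z, vx, vy, vz
names = ["t", "x", "y", "z", "vx", "vy", "vz"]

# Drop comment lines, then skip the header row before reading the data.
lines = [line for line in io.StringIO(text) if not line.startswith("#")]
reader = csv.reader(lines, delimiter=",")
next(reader)
rows = [dict(zip(names, (float(row[c]) for c in columns))) for row in reader]
```

Each entry of `rows` now holds the time, position and velocity of one sample, keyed by name.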
VTK
The vtk reader is developed to read both legacy .vtk files generated by DEM engines such as LIGGGHTS and modern xml-based files generated by other DEM engines such as Lethe.
These different filetypes require different conversion methods, which up4 dispatches to based on the file extension given in the filter argument.
Warning
The file extension in the filter must match the file extension of the files you are trying to convert. If you are trying to convert a .vtu file, the filter must be set to .vtu. If you are trying to convert a .vtk file, the filter must be set to .vtk.
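A quick way to catch a mismatch before converting is to compare each file's suffix with the extension you intend to pass. The helper `mismatched_extensions` and the file names below are hypothetical and not part of up4.

```python
from pathlib import Path


def mismatched_extensions(files, extension):
    """Return the files whose suffix differs from `extension` (e.g. '.vtu')."""
    return [f for f in files if Path(f).suffix != extension]


files = ["run/dump_100.vtu", "run/dump_200.vtu", "run/dump_300.vtk"]
bad = mismatched_extensions(files, ".vtu")
if bad:
    print("These files do not match the filter:", bad)
```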
Legacy VTK
There are two ways to convert vtk files: with up4.Converter.vtk, which requires a list of filenames as input, or with up4.Converter.vtk_from_folder, which reads all vtk files in a folder and converts them to HDF5.
The list of filenames passed to up4.Converter.vtk must be sorted in natural, not lexicographical, order.
This means that the sort should treat numbers in a filename as numbers rather than comparing character by character. For files generated by LIGGGHTS specifically, the sub-files it writes (any with boundingBox in the name) must also be removed.
The code below shows how to do this:
from glob import glob
from natsort import natsorted
files = glob('path/to/folder/*.vtk')
files = [f for f in files if "boundingBox" not in f] # remove LIGGGHTS sub-files
files = natsorted(files)
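To see why the ordering matters, the following standard-library sketch compares lexicographic and natural sorting on made-up file names. The `natural_key` function is a simplified stand-in for what natsort does and assumes all names share the same text/number layout.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so numbers compare numerically."""
    return [int(chunk) if chunk.isdigit() else chunk for chunk in re.split(r"(\d+)", name)]


names = ["dump_10.vtk", "dump_2.vtk", "dump_1.vtk"]
lexicographic = sorted(names)             # dump_10 sorts before dump_2
natural = sorted(names, key=natural_key)  # dump_1, dump_2, dump_10
print(lexicographic)
print(natural)
```

With lexicographic ordering, step 10 would be read before step 2, so the timesteps would end up out of sequence.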
Conversion can then be done with:
import up4
up4.Converter.vtk(
files, # Sorted list of filenames
1e-5, # timestep of the simulation
'output.hdf5', # filename to write
r"(\d+).vtk", # regex to extract the timestep from the filename
)
The regex filter passed to up4.Converter.vtk is used to extract the timestep from the filename. The regex must contain a capture group that matches the digits.
Read more about regex here. The default regex filter should work in most cases.
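The sketch below uses Python's `re` module to show what such a pattern captures from a filename. The file names are made up. How up4 applies the pattern internally, and the conversion of a step number to a time as `step * timestep`, are assumptions made for illustration.

```python
import re

pattern = re.compile(r"(\d+).vtk")  # same pattern as in the example above
dt = 1e-5                           # simulation timestep, as passed to the converter

steps = []
for name in ["dump_1000.vtk", "dump_25000.vtk"]:
    match = pattern.search(name)
    step = int(match.group(1))      # digits captured by the group
    steps.append(step)
    print(name, step, step * dt)    # assumed: time = step number * timestep

# The unescaped "." matches any character, so this name matches as well.
loose_match = pattern.search("dump_7xvtk") is not None
# Escaping the dot restricts the match to a literal ".vtk" suffix.
strict_match = re.search(r"(\d+)\.vtk", "dump_7xvtk") is not None
```

If your folder contains files with similar names, a stricter pattern with an escaped dot may be the safer choice.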
The function up4.Converter.vtk_from_folder is a wrapper around up4.Converter.vtk and can be used as follows:
import up4
up4.Converter.vtk_from_folder(
'path/to/folder', # Path to the folder containing the vtk files
1e-5, # timestep of the simulation
'output.hdf5', # filename to write
r"(\d+).vtk", # regex to extract the timestep from the filename
)
The field-name arguments following the filter argument default to LIGGGHTS naming conventions, but can be changed to match the field names in the vtk files you are trying to convert:
import up4
up4.Converter.vtk(
files, # Sorted list of filenames
1e-5, # timestep of the simulation
'output.hdf5', # filename to write
r"(\d+).vtk", # regex to extract the timestep from the filename
filter = '.vtk', # File extension
velocity_field_name = "Velocity", # Name of the velocity field in the vtk files
)
Modern VTK
The converter for the modern VTK formats supports unstructured grid (.vtu) and polydata (.vtp) files. You will likely need to specify the field names for the velocity, radius, id and type fields in the vtk files you are trying to convert, as the defaults are set to LIGGGHTS conventions. The radius_field_name and diameter_field_name arguments are mutually exclusive, and only one is needed. The end result is the same, as the diameter values are used to calculate the radius values that up4 uses internally. If diameter_field_name is set, this is the value that will be used, regardless of the radius_field_name argument value.
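The precedence described above can be sketched in a few lines. The helper `resolve_radius` and the `fields` dictionary are hypothetical and only mirror the documented behaviour. They are not up4's implementation.

```python
def resolve_radius(fields, radius_field_name=None, diameter_field_name=None):
    """Return per-particle radii, preferring the diameter field when it is given.

    Hypothetical helper: `fields` maps a field name to a list of per-particle values.
    """
    if diameter_field_name is not None:
        return [d / 2 for d in fields[diameter_field_name]]
    if radius_field_name is not None:
        return list(fields[radius_field_name])
    raise ValueError("either a radius or a diameter field name is required")


fields = {"Diameter": [0.2, 0.4], "Radius": [0.1, 0.2]}
radii = resolve_radius(fields, radius_field_name="Radius", diameter_field_name="Diameter")
```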
As with the legacy VTK converter, up4 can convert either a naturally sorted list of files (here with a .vtu extension), or look inside a folder and extract the necessary files itself. Sorting a list of .vtu files and converting them can be done as follows:
from glob import glob
from natsort import natsorted
files = glob('path/to/folder/*.vtu')
files = natsorted(files)
import up4
up4.Converter.vtk(
files, # Sorted list of filenames
1e-5, # timestep of the simulation
'output.hdf5', # filename to write
r"(\d+).vtu", # regex to extract the timestep from the filename
velocity_field_name = "Velocity", # Name of the velocity field in the vtk files
radius_field_name = "Radius", # Name of the radius field in the vtk files
id_field_name = "id", # Name of the id field in the vtk files
type_field_name = "type", # Name of the type field in the vtk files
)
The function up4.Converter.vtk_from_folder is a wrapper around up4.Converter.vtk and can be used as follows:
import up4
up4.Converter.vtk_from_folder(
'path/to/folder', # Path to the folder containing the vtk files
1e-5, # timestep of the simulation
'output.hdf5', # filename to write
r"(\d+).vtu", # regex to extract the timestep from the filename
velocity_field_name = "Velocity", # Name of the velocity field in the vtk files
diameter_field_name = "Diameter", # Name of the diameter field in the vtk files
id_field_name = "id", # Name of the id field in the vtk files
type_field_name = "type", # Name of the type field in the vtk files
)
Dataset Statistics
Once you have generated your HDF5 file, you can read it in using the up4.Data class.
If you pass the resulting object to print, the output may look like the following:
import up4
data = up4.Data('output.hdf5')
print(data)
"""
Dimensions of the system:
x -0.07-->0.06
y 0.00-->0.13
z -0.09-->0.01
The max time of this set is : 2.00
Number of Particles: 1
Mean velocity of: 0.44 m/s
Minimum velocity 0.03 m/s
Maximum Velocity 0.74 m/s
"""