# LAMMPS Workload

Follow this procedure to run the LAMMPS molecular dynamics simulator.

- Operating system: Ubuntu* 22.04
- Hardware: Intel® Data Center GPU Max Series
- Software: Intel® oneAPI Base Toolkit, Intel® oneAPI HPC Toolkit
- Time to complete: 30 minutes 

For more information, see the [LAMMPS documentation](https://www.lammps.org/).

1. Check whether the driver stack is installed.

   ```bash
   xpu-smi discovery
   ```
   
   The command should return at least one Intel® Data Center GPU Max device.
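
   For scripted setups, the check can be reduced to a count. The following is a minimal sketch: the helper name `count_max_gpus` is hypothetical, and it assumes that each device's name line in the discovery output contains the string `Data Center GPU Max`. If the name appears on more than one line per device, the count will be higher than the number of devices.

   ```bash
   # Hypothetical helper: count input lines that mention a Max Series GPU.
   # Assumption: the device name line contains "Data Center GPU Max".
   count_max_gpus() {
     # grep -c prints the number of matching lines; "|| true" keeps the
     # exit status at 0 when the count is 0.
     grep -c "Data Center GPU Max" || true
   }

   # Usage (reads the real tool's output):
   #   xpu-smi discovery | count_max_gpus
   ```

   A result of `0` suggests that the driver stack is missing or that no supported device is visible.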

2. Check whether the oneAPI Base Toolkit and HPC Toolkit are installed.

   ```bash
   apt list intel-basekit intel-hpckit
   ```
   
   Example output (the version numbers depend on the installed release):
   ```
   Listing... Done
   intel-basekit/all,now 2023.2.0-49384 amd64 [installed]
   intel-hpckit/all,now 2023.2.0-49438 amd64 [installed]
   ```

3. If you have not previously configured your environment, install the Ubuntu 22.04 graphics driver. See [dgpu-docs](https://dgpu-docs.intel.com/) for details.

   ```{note} Installation requires access to the Intel package repositories at https://repositories.intel.com and https://apt.repos.intel.com. If your proxy is configured through environment variables such as `http_proxy` or `https_proxy`, adjust the following steps accordingly, for example by adding `-E` (preserve environment) to `sudo` commands.
   ```
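
   One way to apply the `-E` adjustment from the note is sketched below. The helper name `sudo_cmd` is hypothetical; it only inspects the `http_proxy` and `https_proxy` variables and does not cover other proxy configurations.

   ```bash
   # Hypothetical helper: print "sudo -E" when a proxy variable is set,
   # otherwise print plain "sudo".
   sudo_cmd() {
     if [ -n "${http_proxy:-}${https_proxy:-}" ]; then
       echo "sudo -E"
     else
       echo "sudo"
     fi
   }

   # Usage (commented out so that nothing runs by accident):
   #   $(sudo_cmd) apt update
   ```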
   
4. If you have not previously configured your environment, enable access to the Intel repository that serves the oneAPI packages, then install the oneAPI Base Toolkit and HPC Toolkit for Ubuntu 22.04.

   ```bash
   wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
     | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
   echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
   sudo apt update
   ```
   
   ```bash
   sudo apt install -y intel-basekit intel-hpckit
   ```

5. Install build dependencies.

   ```bash
   sudo apt install -y python3 python3-pip git build-essential cmake
   pip3 install mako pyyaml
   ```

6. Build LAMMPS.

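   ```bash
   # Fetch the Compute Aggregation Layer (CAL) and the LAMMPS develop branch.
   git clone https://github.com/intel/compute-aggregation-layer.git cal
   git clone https://www.github.com/lammps/lammps lammps -b develop --depth 1

   # Load the oneAPI environment and enable the required LAMMPS packages.
   cd lammps/src
   source /opt/intel/oneapi/setvars.sh
   make yes-asphere yes-kspace yes-manybody yes-misc
   make yes-molecule yes-rigid yes-dpd-basic yes-gpu
   cd ../..

   # Build CAL and add its build directory to PATH.
   cd cal
   mkdir build; cd build
   cmake ..
   make -j
   export PATH=$(pwd):$PATH
   cd ../..

   # Build the LAMMPS GPU library, then the LAMMPS executable.
   cd lammps/lib/gpu
   make -f Makefile.oneapi -j
   cd ../../src
   make oneapi -j
   cd ../..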
   ```

7. Run LAMMPS. 

   ```bash
   # Add the CAL build directory to PATH so that calrun can be found.
   cd cal/build
   export PATH=$(pwd):$PATH
   cd ../../lammps/src/INTEL/TEST/

   # One-time restart file generation for the liquid crystal benchmark
   mpirun --bootstrap ssh -np 72 ../../lmp_oneapi -in in.lc_generate_restart -log none

   # Environment setup - NEO MASTER 026515 or later
   export I_MPI_FABRICS=shm
   export KMP_AFFINITY="granularity=core,scatter"
   export CAL_ASYNC_CALLS=1

   # Run the liquid crystal benchmark (the figure of merit, FOM, is timesteps/sec)
   I_MPI_PIN_ORDER=bunch KMP_BLOCKTIME=1000 OMP_NUM_THREADS=4 I_MPI_PIN_DOMAIN=8:compact calrun mpirun \
     --bootstrap ssh -np 16 ../../lmp_oneapi -v N off -in in.intel.lc -log none -pk gpu 2 -sf gpu
   ```
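
   The benchmark command fixes how MPI ranks and OpenMP threads map onto the GPU devices. The arithmetic can be sketched as follows, assuming the ranks are spread evenly across the devices requested with `-pk gpu 2`. The variable names are illustrative only.

   ```bash
   # Values taken from the benchmark command in this step.
   np=16            # MPI ranks (-np 16)
   gpus=2           # GPU devices requested (-pk gpu 2)
   omp_threads=4    # OMP_NUM_THREADS=4

   # Ranks sharing each GPU device, and OpenMP threads across all ranks.
   ranks_per_gpu=$(( np / gpus ))
   total_threads=$(( np * omp_threads ))

   echo "ranks per GPU: ${ranks_per_gpu}"
   echo "total OpenMP threads: ${total_threads}"
   ```

   These values correspond to the `8 proc(s) per device` and `4 thread(s) per proc` lines that LAMMPS reports at startup.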

The following example shows output from a run on an Intel® Data Center GPU Max 1550:

```
--------------------------------------------------------------------------
- Using acceleration for gayberne:
-  with 8 proc(s) per device.
-  with 4 thread(s) per proc.
-  with OpenCL Parameters for: INTEL_GPU (500)
-  Horizontal vector operations: ENABLED
-  Shared memory system: No
--------------------------------------------------------------------------
Platform: Intel(R) Corporation Intel(R) OpenCL Graphics OpenCL 3.0
Device 0: Intel(R) Data Center GPU Max 1550, 448 CUs, 61 GB, 1.6 GHZ (Mixed Precision)
Device 1: Intel(R) Data Center GPU Max 1550, 448 CUs, 1.6 GHZ (Mixed Precision)
--------------------------------------------------------------------------

Initializing Device and compiling on process 0...Done.
Initializing Devices 0-1 on core 0...Done.
Initializing Devices 0-1 on core 1...Done.
Initializing Devices 0-1 on core 2...Done.
Initializing Devices 0-1 on core 3...Done.
Initializing Devices 0-1 on core 4...Done.
Initializing Devices 0-1 on core 5...Done.
Initializing Devices 0-1 on core 6...Done.
Initializing Devices 0-1 on core 7...Done.

Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule
Setting up Verlet run ...
  Unit style    : lj
  Current step  : 10
  Time step     : 0.002
Per MPI rank memory allocation (min/avg/max) = 36.96 | 36.96 | 36.96 Mbytes
   Step          Temp          E_pair         E_mol          TotEng         Press
        10   1.9986478     -0.38874135     0              2.6092246      7.3216782
       100   1.9838512     -0.37184965     0              2.6039215      7.3625427
       200   1.9866877     -0.37227117     0              2.6077547      7.3606555
       300   1.9861461     -0.35463178     0              2.6245817      7.4090507
       400   1.9953463     -0.37023171     0              2.622782       7.3629617
       500   2.0061948     -0.38123724     0              2.6280492      7.3411057
       600   1.9888506     -0.37910347     0              2.6041667      7.3414455
       700   2.0001109     -0.37690831     0              2.6232523      7.3446158
       800   2.0084155     -0.38964792     0              2.6229695      7.3212755
       850   2.0003566     -0.3703234      0              2.6302058      7.3599885
```
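
The figure of merit (timesteps/sec) can be derived from the `Loop time of ...` summary line that LAMMPS prints at the end of a run, which is not part of the excerpt above. The following is a minimal sketch: the helper name `lammps_steps_per_sec` and the file name `lc_run.out` are hypothetical, and it assumes the run's standard output was saved to a file, because the benchmark command uses `-log none`.

```bash
# Hypothetical helper: compute timesteps/sec from LAMMPS summary lines of the form
#   Loop time of <seconds> on <P> procs for <steps> steps with <N> atoms
# Reads the named files, or standard input when no file is given.
lammps_steps_per_sec() {
  # Field 4 is the loop time in seconds, field 9 is the number of steps.
  awk '/^Loop time of/ { printf "%.3f\n", $9 / $4 }' "$@"
}

# Usage, assuming the benchmark output was captured (for example with tee):
#   lammps_steps_per_sec lc_run.out
```

If the output contains more than one run, the helper prints one value per `Loop time of` line.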