Distribute Files: Create Folders And Batch Files With Script

by Henrik Larsen

Hey guys! Ever found yourself swimming in a sea of files in one giant folder and thought, "There has to be a better way to organize this?" Well, you're in luck! Today, we're diving into a super practical shell scripting solution for batching files into multiple subfolders, ensuring each folder gets a fair share. This is incredibly useful for anyone dealing with large datasets, processing files in parallel, or simply wanting a cleaner file management system.

The Problem: One Giant Folder to Rule Them All

Imagine you have a massive directory filled with countless files – images, documents, logs, you name it. Working with such a huge collection can be a real pain. Processing these files one by one can be slow, and it's just generally messy. That’s where creating subfolders comes in handy. But, you don’t just want to dump files randomly; you want to distribute them evenly across a specified number of folders.

Distributing files evenly across multiple folders is essential for maintaining a balanced workload when processing them. For example, if you're running a script that analyzes each file, you'll want to divide the files into batches so that each batch can be processed independently, potentially on different cores or machines. This can significantly speed up your workflow. Furthermore, organized file structures make it easier to manage and locate specific files, reducing the chaos and making your data more accessible. So, how do we achieve this magical file distribution?

The Solution: A Shell Scripting Symphony

Let’s break down the shell script that can automate this process. We'll go step-by-step, so even if you're not a bash wizard yet, you’ll be able to follow along. We'll cover the core components, explain what each part does, and then put it all together into a working script. By the end, you'll not only have a solution for your current file organization woes but also a solid foundation for future scripting endeavors.

Step 1: Setting the Stage – Variables and Input

First, we need to define a few variables to make our script flexible and reusable. These variables specify the source directory (where your files are), the destination directory (where the subfolders will be created), and the number of subfolders you want. As a fallback, the script prompts for any of these values that has been left empty.

#!/bin/bash

# Source directory (where your files are located)
SOURCE_DIR="/path/to/your/source/directory"

# Destination directory (where subfolders will be created)
DEST_DIR="/path/to/your/destination/directory"

# Number of subfolders to create
NUM_FOLDERS=10

# Prompt the user for input if variables are not set
if [ -z "$SOURCE_DIR" ]; then
  read -p "Enter the source directory: " SOURCE_DIR
fi

if [ -z "$DEST_DIR" ]; then
  read -p "Enter the destination directory: " DEST_DIR
fi

if [ -z "$NUM_FOLDERS" ]; then
  read -p "Enter the number of subfolders: " NUM_FOLDERS
fi

# Ensure the destination directory exists
mkdir -p "$DEST_DIR"

In this snippet:

  • #!/bin/bash tells the system to use bash to execute the script.
  • We define SOURCE_DIR, DEST_DIR, and NUM_FOLDERS as variables. Replace the placeholder paths with your actual directories.
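  • We use if [ -z "$VARIABLE" ]; then to check if a variable is empty. If it is, we prompt the user to enter a value using read -p. Since all three variables are assigned at the top of the script, these prompts only appear if you set a variable to an empty string (for example, SOURCE_DIR="").
  • mkdir -p "$DEST_DIR" ensures that the destination directory exists. The -p option creates any missing parent directories and does not report an error if the directory already exists, which is super handy.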
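The snippet above trusts whatever values it is given. As a minimal sketch of how the inputs could be checked before any file is touched, here is one possible validation step. The helper names is_positive_integer and validate_settings are made up for this illustration and are not part of the original script.

```bash
#!/bin/bash

# Hypothetical helper (not part of the original script): succeeds only when
# its argument is a whole number greater than zero.
is_positive_integer() {
  [[ "$1" =~ ^[0-9]+$ ]] && (( 10#$1 > 0 ))
}

# Hypothetical helper: checks the three settings before any file is moved.
validate_settings() {
  local source_dir="$1" dest_dir="$2" num_folders="$3"

  if [ ! -d "$source_dir" ]; then
    echo "Error: source directory '$source_dir' does not exist." >&2
    return 1
  fi
  if [ -z "$dest_dir" ]; then
    echo "Error: destination directory is empty." >&2
    return 1
  fi
  if ! is_positive_integer "$num_folders"; then
    echo "Error: number of subfolders must be a positive integer." >&2
    return 1
  fi
}
```

In the script, a call such as validate_settings "$SOURCE_DIR" "$DEST_DIR" "$NUM_FOLDERS" || exit 1 could sit right before mkdir -p "$DEST_DIR". Rejecting 0 here also avoids a division-by-zero error in the arithmetic of the next step.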

Step 2: Counting Files and Calculating Distribution

Next, we need to count the total number of files in the source directory and calculate how many files should go into each subfolder. This ensures an even distribution.

# Count the total number of files
TOTAL_FILES=$(find "$SOURCE_DIR" -type f | wc -l)

# Calculate the number of files per folder
FILES_PER_FOLDER=$((TOTAL_FILES / NUM_FOLDERS))

# Calculate the remainder (for even distribution)
REMAINDER=$((TOTAL_FILES % NUM_FOLDERS))

echo "Total files: $TOTAL_FILES"
echo "Files per folder: $FILES_PER_FOLDER"
echo "Remainder: $REMAINDER"

Here’s what’s happening:

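  • TOTAL_FILES=$(find "$SOURCE_DIR" -type f | wc -l) uses the find command to list all regular files (-type f) in the source directory, including files in any of its subdirectories, then pipes the output to wc -l, which counts the number of lines (i.e., files). A filename that contains a newline would be counted more than once, so this assumes ordinary filenames.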
  • FILES_PER_FOLDER=$((TOTAL_FILES / NUM_FOLDERS)) calculates the base number of files each folder should receive using integer division.
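  • REMAINDER=$((TOTAL_FILES % NUM_FOLDERS)) calculates the remainder of the division, that is, how many files are left over once every folder has its base share. We'll use this to give one extra file to each of the first REMAINDER folders.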
  • The echo statements print these values to the console, which is useful for debugging and understanding the distribution.
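To make the arithmetic concrete, here is a small sketch that prints the folder sizes this scheme produces for a given file count. The function name folder_sizes is hypothetical and only serves the illustration; it uses the same integer division and remainder as the snippet above.

```bash
#!/bin/bash

# Hypothetical helper: prints how many files each folder would receive,
# giving one extra file to each of the first "remainder" folders.
folder_sizes() {
  local total_files="$1" num_folders="$2"
  local files_per_folder=$((total_files / num_folders))
  local remainder=$((total_files % num_folders))
  local sizes=() i

  for ((i = 1; i <= num_folders; i++)); do
    if ((i <= remainder)); then
      sizes+=("$((files_per_folder + 1))")
    else
      sizes+=("$files_per_folder")
    fi
  done
  echo "${sizes[*]}"
}

folder_sizes 23 5   # prints: 5 5 5 4 4
folder_sizes 20 5   # prints: 4 4 4 4 4
folder_sizes 3 5    # prints: 1 1 1 0 0
```

With 23 files and 5 folders, the base share is 4 and the remainder is 3, so the first three folders hold 5 files and the last two hold 4. No two folders ever differ by more than one file.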

Step 3: Creating the Subfolders

Now, we'll create the subfolders in the destination directory. We'll use a simple loop to create each folder with a numbered name.

# Create subfolders
for ((i=1; i<=$NUM_FOLDERS; i++))
do
  FOLDER_NAME="folder_$i"
  FOLDER_PATH="$DEST_DIR/$FOLDER_NAME"
  mkdir -p "$FOLDER_PATH"
  echo "Creating folder: $FOLDER_PATH"
done

In this loop:

  • for ((i=1; i<=$NUM_FOLDERS; i++)) is a C-style for loop (a bash feature) that iterates from 1 to the number of folders.
  • FOLDER_NAME="folder_$i" creates a folder name like "folder_1", "folder_2", etc.
  • FOLDER_PATH="$DEST_DIR/$FOLDER_NAME" constructs the full path to the subfolder.
  • mkdir -p "$FOLDER_PATH" creates the subfolder.
  • The echo statement provides feedback on the folder creation process.
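If you'd like to try the folder loop without touching real data, one option is to run it inside a throwaway directory. The sketch below wraps the loop in a function named make_subfolders (a hypothetical name, not from the original script) and points it at a directory created by mktemp.

```bash
#!/bin/bash

# Hypothetical wrapper around the folder-creation loop from this step.
make_subfolders() {
  local dest_dir="$1" num_folders="$2" i
  for ((i = 1; i <= num_folders; i++)); do
    mkdir -p "$dest_dir/folder_$i"
  done
}

# Try it in a throwaway directory created by mktemp.
SANDBOX=$(mktemp -d)
make_subfolders "$SANDBOX/batches" 3
ls "$SANDBOX/batches"   # lists folder_1, folder_2 and folder_3

# Clean up the sandbox afterwards.
rm -rf "$SANDBOX"
```

Because mkdir -p is used, calling the function a second time on the same directory is harmless.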

Step 4: Distributing the Files

This is the heart of the script – where we actually move the files into the subfolders. We'll use a loop and the find command to iterate through the files and distribute them evenly.

# Distribute files
CURRENT_FOLDER=1
FILES_COUNT=0

find "$SOURCE_DIR" -type f | while IFS= read -r FILE
do
  # Determine the destination folder
  DEST_FOLDER="$DEST_DIR/folder_$CURRENT_FOLDER"

  # Move the file
  mv "$FILE" "$DEST_FOLDER/"
  echo "Moving file: $FILE to $DEST_FOLDER"

  # Increment the file count
  FILES_COUNT=$((FILES_COUNT + 1))

  # Check if the current folder is full
  if [[ $FILES_COUNT -ge $FILES_PER_FOLDER ]]; then
    # Add extra files to the first few folders based on the remainder
    if [[ $CURRENT_FOLDER -le $REMAINDER ]]; then
      if [[ $FILES_COUNT -ge $((FILES_PER_FOLDER + 1)) ]]; then
        CURRENT_FOLDER=$((CURRENT_FOLDER + 1))
        FILES_COUNT=0
      fi
    else
      CURRENT_FOLDER=$((CURRENT_FOLDER + 1))
      FILES_COUNT=0
    fi
  fi

  # Reset folder counter if it exceeds the number of folders
  if [[ $CURRENT_FOLDER -gt $NUM_FOLDERS ]]; then
    CURRENT_FOLDER=1
  fi
done

echo "Files distributed successfully!"

Let's break this down:

  • CURRENT_FOLDER=1 and FILES_COUNT=0 initialize variables to keep track of the current folder and the number of files moved into it.
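  • find "$SOURCE_DIR" -type f | while IFS= read -r FILE is a powerful construct. It finds all files in the source directory and reads them one by one into the FILE variable. IFS= keeps leading and trailing whitespace in each name, and -r stops read from treating backslashes as escape characters, so filenames with spaces and backslashes are handled correctly. Filenames that contain a newline are still split across two reads.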
  • DEST_FOLDER="$DEST_DIR/folder_$CURRENT_FOLDER" determines the destination folder for the current file.
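  • mv "$FILE" "$DEST_FOLDER/" moves the file to the destination folder. Note that by default mv replaces an existing file with the same name. Because find also descends into subdirectories, two files with the same name from different subdirectories can collide this way.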
  • FILES_COUNT=$((FILES_COUNT + 1)) increments the file count for the current folder.
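  • The if [[ $FILES_COUNT -ge $FILES_PER_FOLDER ]]; then block checks if the current folder has reached its base share. What happens next depends on whether the folder is one of the first REMAINDER folders.
  • The nested if [[ $CURRENT_FOLDER -le $REMAINDER ]]; then block handles the remainder files. The first REMAINDER folders only count as full once they hold FILES_PER_FOLDER + 1 files, so each of them receives one extra file. Every other folder is full at FILES_PER_FOLDER files. When a folder is full, the script increments CURRENT_FOLDER to move to the next folder and resets FILES_COUNT.
  • if [[ $CURRENT_FOLDER -gt $NUM_FOLDERS ]]; then CURRENT_FOLDER=1; fi is a safety net that wraps the folder counter back to 1 if it exceeds the number of folders. With the counts from Step 2 this normally only happens after the last file has been moved, but it keeps the script from targeting a nonexistent folder if more files show up than were counted.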
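For comparison, here is a sketch of a different technique, not the one the script above uses: a plain round-robin that sends file 1 to folder 1, file 2 to folder 2, and so on, wrapping around with the modulo operator. It ends up with the same folder sizes (no two folders differ by more than one file) with less bookkeeping. The function name distribute_round_robin is hypothetical, and the sketch assumes a find that supports -print0 (GNU and BSD find do) and a destination directory that lies outside the source directory.

```bash
#!/bin/bash

# Hypothetical helper: moves every regular file found under source_dir into
# dest_dir/folder_1 .. folder_N, one file per folder in turn (round-robin).
# Assumption: dest_dir is not located inside source_dir.
distribute_round_robin() {
  local source_dir="$1" dest_dir="$2" num_folders="$3"
  local index=0 folder file

  # -print0 and read -d '' separate names with NUL bytes, so even names
  # that contain newlines are read correctly. Process substitution keeps
  # the loop in the current shell rather than a pipeline subshell, so
  # "index" is still available after the loop.
  while IFS= read -r -d '' file; do
    folder="$dest_dir/folder_$((index % num_folders + 1))"
    mkdir -p "$folder"
    mv -- "$file" "$folder/"
    index=$((index + 1))
  done < <(find "$source_dir" -type f -print0)

  echo "Moved $index files."
}
```

A call like distribute_round_robin "$SOURCE_DIR" "$DEST_DIR" "$NUM_FOLDERS" would replace both the folder-creation loop and the distribution loop. One difference to keep in mind: folders are created on demand here, so with fewer files than folders the unused folders are never created. Like the original loop, this sketch does not protect against name collisions in the destination.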

Step 5: Putting It All Together – The Complete Script

Now, let's combine all the pieces into a single, runnable script:

#!/bin/bash

# Source directory (where your files are located)
SOURCE_DIR="/path/to/your/source/directory"

# Destination directory (where subfolders will be created)
DEST_DIR="/path/to/your/destination/directory"

# Number of subfolders to create
NUM_FOLDERS=10

# Prompt the user for input if variables are not set
if [ -z "$SOURCE_DIR" ]; then
  read -p "Enter the source directory: " SOURCE_DIR
fi

if [ -z "$DEST_DIR" ]; then
  read -p "Enter the destination directory: " DEST_DIR
fi

if [ -z "$NUM_FOLDERS" ]; then
  read -p "Enter the number of subfolders: " NUM_FOLDERS
fi

# Ensure the destination directory exists
mkdir -p "$DEST_DIR"

# Count the total number of files
TOTAL_FILES=$(find "$SOURCE_DIR" -type f | wc -l)

# Calculate the number of files per folder
FILES_PER_FOLDER=$((TOTAL_FILES / NUM_FOLDERS))

# Calculate the remainder (for even distribution)
REMAINDER=$((TOTAL_FILES % NUM_FOLDERS))

echo "Total files: $TOTAL_FILES"
echo "Files per folder: $FILES_PER_FOLDER"
echo "Remainder: $REMAINDER"

# Create subfolders
for ((i=1; i<=$NUM_FOLDERS; i++))
do
  FOLDER_NAME="folder_$i"
  FOLDER_PATH="$DEST_DIR/$FOLDER_NAME"
  mkdir -p "$FOLDER_PATH"
  echo "Creating folder: $FOLDER_PATH"
done

# Distribute files
CURRENT_FOLDER=1
FILES_COUNT=0

find "$SOURCE_DIR" -type f | while IFS= read -r FILE
do
  # Determine the destination folder
  DEST_FOLDER="$DEST_DIR/folder_$CURRENT_FOLDER"

  # Move the file
  mv "$FILE" "$DEST_FOLDER/"
  echo "Moving file: $FILE to $DEST_FOLDER"

  # Increment the file count
  FILES_COUNT=$((FILES_COUNT + 1))

  # Check if the current folder is full
  if [[ $FILES_COUNT -ge $FILES_PER_FOLDER ]]; then
    # Add extra files to the first few folders based on the remainder
    if [[ $CURRENT_FOLDER -le $REMAINDER ]]; then
      if [[ $FILES_COUNT -ge $((FILES_PER_FOLDER + 1)) ]]; then
        CURRENT_FOLDER=$((CURRENT_FOLDER + 1))
        FILES_COUNT=0
      fi
    else
      CURRENT_FOLDER=$((CURRENT_FOLDER + 1))
      FILES_COUNT=0
    fi
  fi

  # Reset folder counter if it exceeds the number of folders
  if [[ $CURRENT_FOLDER -gt $NUM_FOLDERS ]]; then
    CURRENT_FOLDER=1
  fi
done

echo "Files distributed successfully!"

To use this script:

  1. Save it to a file, for example, distribute_files.sh.
  2. Make it executable: chmod +x distribute_files.sh.
  3. Run it: ./distribute_files.sh.
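  4. Before running, edit SOURCE_DIR, DEST_DIR, and NUM_FOLDERS at the top of the script. The script only prompts for a value if the corresponding variable is set to an empty string. Since mv moves the files rather than copying them, it's worth trying the script on a copy of your data first.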
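Once the script has run, a quick way to check the result is to count the files in each subfolder. The helper below is a hypothetical sketch (report_folder_counts is not part of the script) that assumes the folder_N naming from Step 3. The path in the last line is the same placeholder used in the script, so replace it with your own destination directory.

```bash
#!/bin/bash

# Hypothetical helper: prints one "folder_N: count" line for each
# folder_* subdirectory of the destination directory.
report_folder_counts() {
  local dest_dir="$1" folder count
  for folder in "$dest_dir"/folder_*/; do
    # Skip the unexpanded pattern when no folder matches.
    [ -d "$folder" ] || continue
    count=$(find "$folder" -type f | wc -l)
    # $((count)) strips the padding some wc implementations add.
    echo "$(basename "$folder"): $((count))"
  done
}

report_folder_counts "/path/to/your/destination/directory"
```

The counts should differ by at most one between folders. The folders are listed in glob order, so folder_10 appears before folder_2.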

Conclusion: Taming the File Chaos

So there you have it! A shell script that can automatically create subfolders and distribute files evenly. This is a powerful tool for anyone dealing with large numbers of files. Whether you're a data scientist, a system administrator, or just someone who likes to keep their files organized, this script can save you a ton of time and effort.

By breaking down the problem into smaller steps and using the right tools, even complex tasks like file distribution become manageable. And remember, scripting is all about practice. The more you experiment and build, the more comfortable you'll become. So go ahead, give this script a try, and start taming that file chaos!

Happy scripting, guys! And remember, a well-organized file system is a happy file system!