Django Step by Step in Windows: Simple App to Process and Download Data

After the previous two sessions, I finally created a bare-bones website (cleansite) where the user can upload a CSV file, have an algorithm process the data on the back end, and then download the result to the local drive. I named the app solution. Each file, its location, and particularly the error-prone steps are explained in detail below:

At the cleansite level, Django provides settings.py and urls.py. In settings.py, in addition to adding the app name to INSTALLED_APPS, the static-files settings at the end of the file are often supplemented with the following two lines (they assume os is imported at the top of settings.py):

MEDIA_URL = '/media/'
MEDIA_ROOT = os.path.join(BASE_DIR, 'media')

The key part of the project-level urls.py is

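from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('admin/', admin.site.urls),
    # Route the site root to the solution app's own urls.py
    path('', include('solution.urls')),
]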

At the solution level, there is another urls.py file

from django.urls import path
from . import views
urlpatterns = [
	path('', views.home, name='home')
]

To handle files, we modify models.py as follows

from django.db import models
# Create your models here.
class Document(models.Model):
#    description = models.CharField(max_length=255, blank=True)
    document = models.FileField()

and then the forms.py file

from django import forms
from solution.models import Document
class DocumentForm(forms.ModelForm):
    class Meta:
        model = Document
        fields = ('document',)

The HTML template (home.html) that lays out the page is kept as plain as possible:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
        <title>Minimal Django Solution</title>
        <script>document.write('<base href="' + document.location + '" />');</script>
        <!-- Latest compiled and minified CSS -->
    <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css">

    <!-- jQuery library -->
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>

    <!-- Popper JS -->
    <script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js"></script>

    <!-- Latest compiled JavaScript -->
    <script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js"></script>
    <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet">
    </head>

    <body>
    <p>Uploaded files:</p>

    {% if documents %}

      <ul>
        {% for obj in documents %}
          <li>
            <a href="{{ obj.document.url }}">{{ obj.document.name }}</a>
          </li>
        {% endfor %}
      </ul>
    {% else %}
      <p>No documents.</p>
    {% endif %}

    {% block content %}
    <form method="post" enctype="multipart/form-data">
    {% csrf_token %}
    {{ form.as_p }}
    <p>{{ form.non_field_errors }}</p>

    <input type="file" name="myfile">
    <button type="text">Download</button>

    <p>
    </p>
    <button type="text">Threshold</button>
    <input type="text" name="threshold" style="height: 44px; border: 1px solid black; ">

   </form>

    {% if uploaded_file_url %}
    <p>File uploaded at: <a href="{{ uploaded_file_url }}">{{ uploaded_file_url }}</a></p>
    {% endif %}

    <p> </p>
    <p>test download here</p>


    {% if downloaded_file_url %}
    <p>File Downloaded at: <a href="{{ downloaded_file_url }}">{{ downloaded_file_url }}</a></p>
    {% endif %}


    <p><bold>myfile exist at:</bold>{{ testcsv }}</p>


    {% endblock %}

    </body>

</html>

What is worth elaborating is that a single form declared with method="post" enctype="multipart/form-data" can hold several inputs and buttons. Moreover, it is not obvious from the template alone that the "Download" button is what submits the uploaded data and returns the processed result.

views.py is the centerpiece of the code:


from django.shortcuts import render, redirect
from django.conf import settings
from django.core.files.storage import FileSystemStorage
from django.http import HttpResponse

from solution.models import Document
from solution.forms import DocumentForm
import pandas as pd
import numpy as np
import os
import csv

def astrip(astring):
    bs = astring.strip("\r\n")
    return bs.replace("\r\n", "")

def Capping(df, threshold):
    while (df.weight > threshold).any():
        largest = df.weight.max()  # float() on a one-element Series is deprecated in recent pandas
        df['weight_1'] = 0
        df.loc[df.weight == threshold, 'weight_1'] = threshold
        # df['weight_1'][df.weight == threshold] = threshold
        num = len(df[df.weight == largest]) 
        df.loc[df.weight == largest, 'weight_1'] = threshold
        # df['weight_1'][df.weight == largest] = threshold
        dist = (largest - threshold)*num
        total = df.weight[(df.weight_1 == 0)].sum()
        df.loc[df.weight_1 == 0, 'weight_1'] = df.weight + dist*(df.weight/total)
        # df['weight_1'][df.weight_1 == 0] = df.weight + dist*(df.weight/total)
        del df['weight']
        df.rename(columns={'weight_1': 'weight'}, inplace=True)
    return df

def home(request):
    documents = Document.objects.all()
    # .get() avoids a MultiValueDictKeyError when the form is posted without a file
    if request.method == 'POST' and request.FILES.get('myfile'):
        myfile = request.FILES['myfile']
        fs = FileSystemStorage()
        filename = fs.save(myfile.name, myfile)
        uploaded_file_url = fs.url(filename)
        
        from django.core.files.storage import default_storage
        f = default_storage.open(os.path.join(settings.MEDIA_ROOT, 'test.csv'), 'rb')
        data = f.read()
        f.close()
        mydf = []
        for line in myfile:
            mydf.append(line.decode('utf-8').split(','))
        print('TEST')
        tmp = pd.DataFrame(mydf)
        tmp[tmp.columns[-1]] = tmp[tmp.columns[-1]].apply(astrip)
        
        tmp.columns = tmp.iloc[0]
        tmp = tmp[1:]
        tmp[tmp.columns[-1]] = tmp[tmp.columns[-1]].astype(float)
        print(tmp.dtypes)
        mylist = []
        for line in myfile:
            mylist.append(line.decode('UTF-8').split(','))
        tmp = pd.DataFrame(mylist)
        testcsv = tmp.iloc[1:, :].copy()  # drop the header row; copy so later assignments are safe
        testcsv.columns = ['date','id','cap','weight']
        testcsv.weight = testcsv.weight.astype(float)
        t = request.POST['threshold']
        threshold = float(t)
        dates = testcsv.date.unique()
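        capped = []
        for day in dates:
            # Work on a copy of each day's rows and keep the frame Capping returns
            dfile = testcsv[testcsv.date == day].copy()
            capped.append(Capping(dfile, threshold))
        # DataFrame.append was removed in pandas 2.0; concat is the replacement
        df = pd.concat(capped)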
        
        response = HttpResponse(content_type='text/csv')
        response['Content-Disposition'] = 'attachment; filename="output.csv"'
        
        writer = csv.writer(response)
        writer.writerow(['date','id','cap','weight'])
        l = df.values.tolist()
        for row in l:
            writer.writerow(row)
            
        return response
        
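    # Pass the queried documents so the template's {% if documents %} block can list them
    return render(request, 'home.html', {'documents': documents})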
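
The redistribution that Capping performs can be easier to follow without pandas. Below is a minimal pure-Python sketch of the same idea: cap the largest weight at the threshold and spread the excess over the uncapped weights in proportion to their size, repeating until nothing exceeds the cap. The function name cap_weights, the tolerance EPS and the two error checks are hypothetical additions for illustration; they are not part of the original view.

```python
EPS = 1e-12  # tolerance for floating-point comparisons (an assumption of this sketch)


def cap_weights(weights, threshold):
    """Cap each weight at `threshold`, spreading the excess over uncapped weights.

    Hypothetical pure-Python stand-in for the pandas-based Capping() helper.
    """
    weights = list(weights)
    if threshold * len(weights) < sum(weights) - EPS:
        raise ValueError("threshold too low: the weights cannot all fit under it")
    while any(w > threshold + EPS for w in weights):
        largest = max(weights)
        # Weights already at the cap, or equal to the current largest, are pinned.
        pinned = [w == threshold or w == largest for w in weights]
        # Excess to hand out: only the entries holding the current largest value.
        excess = sum(w - threshold for w in weights if w == largest)
        free_total = sum(w for w, p in zip(weights, pinned) if not p)
        if free_total == 0:
            raise ValueError("no uncapped weight left to absorb the excess")
        # Pinned entries drop to the cap; free entries grow in proportion to size.
        weights = [
            threshold if p else w + excess * w / free_total
            for w, p in zip(weights, pinned)
        ]
    return weights


print(cap_weights([0.5, 0.3, 0.2], 0.4))
```

With a threshold of 0.4 the weights 0.5, 0.3, 0.2 become roughly 0.4, 0.36, 0.24: the 0.1 of excess is split 3:2 between the two smaller weights, and the total stays the same.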

There are several traps that are not easy to understand at first: 1. the characteristics of the myfile object; 2. the uploaded content arrives as bytes and has to be converted; 3. saving the in-memory file to a folder and using HttpResponse to make the processed data downloadable via a URL call; 4. how to read a file already saved on the server (the media folder in this simple case).

First, the myfile characteristics. Printing its type to the console (a great way to test and debug) shows <class 'django.core.files.uploadedfile.InMemoryUploadedFile'>, which behaves quite differently from a normal file opened in an interactive Python session.

Second, the uploaded file yields bytes, so each line needs to be converted to text with .decode('UTF-8').
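
To see the decoding step in isolation, here is a small sketch that runs outside Django. It assumes io.BytesIO is a fair stand-in for the uploaded file, since both yield bytes lines when iterated; the helper name rows_from_upload and the sample data are made up for the demonstration.

```python
import io


def rows_from_upload(upload):
    """Decode each bytes line of a binary file-like object into a list of fields."""
    rows = []
    for raw_line in upload:
        # Each line is bytes; decode it and drop the trailing line ending.
        text = raw_line.decode("utf-8").rstrip("\r\n")
        rows.append(text.split(","))
    return rows


# io.BytesIO stands in for request.FILES['myfile'] (an assumption for this demo).
fake_upload = io.BytesIO(b"date,id,cap,weight\r\n2020-01-01,A,100,0.6\r\n")
rows = rows_from_upload(fake_upload)
print(rows[0])
print(rows[1])
```

Skipping the decode would leave every field as a bytes object, and calling split(',') with a str separator on bytes raises a TypeError.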

Third, the Django documentation has a detailed chapter on how to output a CSV file. The relevant part of the view is:

        response = HttpResponse(content_type='text/csv')
        response['Content-Disposition'] = 'attachment; filename="output.csv"'
        
        writer = csv.writer(response)
        writer.writerow(['date','id','cap','weight'])
        l = df.values.tolist()
        for row in l:
            writer.writerow(row)
            
        return response

The sample code in the official documentation is more straightforward:

import csv
from django.http import HttpResponse

def some_view(request):
    # Create the HttpResponse object with the appropriate CSV header.
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

    writer = csv.writer(response)
    writer.writerow(['First row', 'Foo', 'Bar', 'Baz'])
    writer.writerow(['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"])

    return response
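
The reason an HttpResponse can be handed to csv.writer is that the writer only needs an object with a write() method. The sketch below shows this with io.StringIO standing in for the response, so it runs without Django; the helper name write_rows and the sample row are assumptions for illustration.

```python
import csv
import io


def write_rows(target, header, rows):
    """Write a header and data rows as CSV to any object that has write()."""
    writer = csv.writer(target)
    writer.writerow(header)
    writer.writerows(rows)


# io.StringIO plays the role of the HttpResponse here (an assumption for the demo).
buffer = io.StringIO()
write_rows(buffer, ["date", "id", "cap", "weight"], [["2020-01-01", "A", 100, 0.4]])
csv_text = buffer.getvalue()
print(csv_text)
```

In the view, passing the response object as target would send the same text to the browser, and the Content-Disposition header is what turns it into a download.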

Fourth, a file already saved on the server can be read back thanks to the Django storage functions:

    from django.core.files.storage import default_storage
    f = default_storage.open(os.path.join(settings.MEDIA_ROOT, 'test.csv'), 'rb')
    data = f.read()
    f.close()
    mydf = []
    for line in myfile:
        mydf.append(line.decode('utf-8').split(','))
