Krzysztof Żuraw


Personal site

Transcoding with AWS- part one

Nowadays moving everything to the cloud is becoming more and more popular, and a lot of software companies are moving their technology stack to such infrastructure. One of the biggest players in this field is Amazon Web Services - AWS. That's why I decided to write a series of blog posts about it: I adapt existing code from my previous project and move the transcoding process to the cloud.

Overview of series

I decided to adapt code from my previous blog series about Celery and RabbitMQ. I did that because the code from that django application actually transcodes mp3 files to other formats. This series will be divided into the following parts:

  • Moving static and media files to AWS
  • Transcoding files inside AWS transcoder
  • Notifying user that transcode is complete
  • User downloads transcoded file

Moving static and media files to AWS

The AWS transcoder operates only on files that are inside an S3 bucket, so first I need to change how these files are served in django.

Let's say that I already have an account on AWS. The next step is to create a dedicated user via IAM. While creating the user I give it access to AWS S3:

Policy for IAM user

and after I download its credentials I can create an S3 bucket - I have chosen Ireland because with Frankfurt I had a problem with uploading files to S3. After creating the bucket, it's time to add a policy. The policy is basically JSON that tells AWS which user can access a given bucket. More information about that can be found here. Adding a policy is quite simple from the S3 management view:

Policy for S3 bucket

This policy looks like this:

{
      "Version": "2008-10-17",
      "Statement": [
              {
                      "Sid": "PublicReadForGetBucketObjects",
                      "Effect": "Allow",
                      "Principal": {
                              "AWS": "*"
                      },
                      "Action": "s3:GetObject",
                      "Resource": "AWS_RESOURCE"
              },
              {
                      "Effect": "Allow",
                      "Principal": {
                              "AWS": "AWS_PRINCIPAL"
                      },
                      "Action": "s3:*",
                      "Resource": [
                              "AWS_RESOURCE",
                      ]
              }
      ]
}

Where AWS_PRINCIPAL is your IAM user in the format of an ARN resource: "arn:aws:iam::AMAZON_ID:user/IAM_USER", and AWS_RESOURCE is arn:aws:s3:::S3_BUCKET_NAME/*.

As I have this set up, I can make changes in the application itself. To use S3 as a container for static and media files I used django-storages. To make django-storages work I have to add a couple of things in settings.py:

import environ
env = environ.Env()


INSTALLED_APPS = [
  # other applications
  'storages',
]

AWS_STORAGE_BUCKET_NAME = env('AWS_STORAGE_BUCKET_NAME')
AWS_ACCESS_KEY_ID = env('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = env('AWS_SECRET_ACCESS_KEY')
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
AWS_HEADERS = {
    'Expires': 'Thu, 15 Apr 2010 20:00:00 GMT',
    'Cache-Control': 'max-age=86400',
}
AWS_S3_HOST = 's3-eu-west-1.amazonaws.com'

I'm using another package here called django-environ. It allows me to read certain settings from environment variables. I set them in my virtualenvwrapper script inside $ENV_PATH/bin/postactivate:

export AWS_STORAGE_BUCKET_NAME='name'
export AWS_ACCESS_KEY_ID='key'
export AWS_SECRET_ACCESS_KEY='secret_key'

The last line with AWS_S3_HOST is really important here, as boto - the client that django-storages uses underneath to connect to AWS - doesn't have a default region set up. If the host is not specified, uploads are redirected, which prevents transferring static files or uploading any large media file.
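
If you want to double-check the credentials and the region before wiring them into django-storages, a quick sanity check with boto itself can help. This is only a sketch - it assumes the bucket lives in eu-west-1 and reuses the same environment variables as above:

import os

import boto.s3

# Connect explicitly to eu-west-1 instead of relying on the default endpoint.
conn = boto.s3.connect_to_region(
    'eu-west-1',
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
)

# Fails loudly if the credentials or the bucket policy are wrong.
bucket = conn.get_bucket(os.environ['AWS_STORAGE_BUCKET_NAME'])
print([key.name for key in bucket.list()])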

As the AWS settings are in place, it's time to change the static files settings in settings.py:

STATICFILES_LOCATION = 'static'
STATIC_URL = "https://%s/%s/" % (AWS_S3_CUSTOM_DOMAIN, STATICFILES_LOCATION)
STATICFILES_STORAGE = 'audio_transcoder.storages.StaticStorage'
STATICFILES_DIRS = (
  os.path.join(BASE_DIR.root, 'static'),
)

I add a custom StaticStorage because I want my static files to live under static in the S3 bucket:

from django.conf import settings
from storages.backends.s3boto import S3BotoStorage


class StaticStorage(S3BotoStorage):
  location = settings.STATICFILES_LOCATION

To upload my static files I simply run python manage.py collectstatic. After a while I can see that my files are in the bucket:

Static files inside S3

Right now when I run my server I can see the location of my static files:

Static files loaded from S3

As static files are working it's high time to use AWS for media files. Right now it's simple - in settings I add:

MEDIAFILES_LOCATION = 'media'
MEDIA_URL = "https://%s/%s/" % (AWS_S3_CUSTOM_DOMAIN, MEDIAFILES_LOCATION)
DEFAULT_FILE_STORAGE = 'audio_transcoder.storages.MediaStorage'

with a custom storage class:

class MediaStorage(S3BotoStorage):
  location = settings.MEDIAFILES_LOCATION
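
With DEFAULT_FILE_STORAGE set, any FileField in the project picks up MediaStorage automatically - no per-field configuration is needed. Here is a minimal sketch; the AudioFile model and its upload_to path are only illustrative, not the exact model from this project:

from django.db import models


class AudioFile(models.Model):
    # No storage argument needed - DEFAULT_FILE_STORAGE (MediaStorage) applies,
    # so the uploaded file lands under media/uploads/ in the S3 bucket.
    name = models.CharField(max_length=100)
    mp3 = models.FileField(upload_to='uploads/')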

Now when I upload an mp3 file it is sent directly to the S3 bucket under the media location:

Media files in S3

That's all for today! In the next blog post, I will write about how to set up AWS transcoder.

The code that I have made so far is available on github. Stay tuned for the next blog post in this series.

Special thanks to Kasia for being an editor for this post. Thank you.

While creating this blog post I used an excellent tutorial from cactus group.

Cover image by Harald Hoyer under CC BY-SA 2.0, via Wikimedia Commons

To see comments and full article enter: Transcoding with AWS- part one

Docker.py- python API for Docker

Once upon a time, a friend and I decided to write an application that helps us do code katas. The first problem that we faced was how to run code provided by the user in a safe manner, so that our server won't be destroyed. After giving it some thought I decided to write a prototype of an application that runs the code inside a Docker container which is destroyed immediately after the code has been run. This blog post is about that prototype.

Assumptions

I need an application that gets code from the user, executes it and gives the output back. As many people have said before me, input from the user cannot be trusted, so I need to run it inside some kind of container. To do that I used the Docker python API - docker.py. Using that and Flask I created Tdd-app-prototype. Under the hood, the application works like this: the user writes code on a website and clicks submit, Docker creates a container based on the python docker image and executes the code, and I take the output from the container and destroy it afterwards.

As we know what the application should do, let's jump into the code.

Code

The first problem is that I don't want to write the code provided by the user to disk, only to read it back from disk and have it executed by Docker. I want to keep it in memory - a perfect case for StringIO. The code that does this looks as follows:

import io

from flask import Flask, request

app = Flask(__name__)


@app.route("/send_code", methods=['POST'])
def execute_code():
  data = request.form['source_code']
  code = io.StringIO(data)
  create_container(code)
  output = get_code_from_docker()
  return output

Here, besides specifying the route in Flask, I take the data from the form, wrap it in StringIO and create a container from that code. The function that does that is below:

def create_container(code):
  cli.create_container(
       image='python:3',
       command=['python','-c', code.getvalue()],
       name='tdd_app_prototype',
  )

What is cli here? docker.py can talk to a Docker daemon running on a machine other than my own, so before I can use any of these functions I need to configure a Client:

from docker import Client

cli = Client(base_url='unix://var/run/docker.sock')

It tells docker.py to use my local Docker. Let's go back to create_container. I tell docker.py to use the official python 3 image. Then I specify the command to run: python -c plus my code taken from StringIO. If you want to run a standalone python script instead, you can use this:

import os


def create_container(code):
  cli.create_container(
       image='python:3',
       # run the mounted script directly (no '-c', which would treat the
       # file name as inline code)
       command=['python', 'my_code.py'],
       volumes=['/opt'],
       host_config=cli.create_host_config(
           binds={os.getcwd(): {
               'bind': '/opt',
               'mode': 'rw',
               }
           }
       ),
       name='tdd_app_prototype',
       working_dir='/opt'
  )

The volumes and host_config keyword arguments tell Docker to mount volumes. It is the same as running docker run -v "$PWD":/opt. Finally, I set working_dir so I don't need to provide the full path to my_code.py. As the container is created, now it is time to start it:

def get_code_from_docker():
  cli.start('tdd_app_prototype')
  cli.wait('tdd_app_prototype')
  output = cli.logs('tdd_app_prototype')

  cli.remove_container('tdd_app_prototype', force=True)

  return "From docker: {}".format(output.strip())

I use wait here to wait for the container to stop. Then I grab the container logs as the output and remove the container.
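
Because the whole point is to run untrusted code, it can also make sense to constrain the container. A possible sketch, assuming a docker-py version where mem_limit and network_mode are accepted by create_host_config (availability differs between releases):

def create_limited_container(code):
  cli.create_container(
       image='python:3',
       command=['python', '-c', code.getvalue()],
       name='tdd_app_prototype',
       host_config=cli.create_host_config(
           mem_limit='128m',      # cap memory used by the user's code
           network_mode='none',   # no network access inside the container
       ),
  )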

That's all for today! If you want to see the full code, grab it here. Do you know other ways of using docker.py? Please leave a comment.

Special thanks to Kasia for being editor for this post. Thank you.

Cover image by Gabriel Barathieu under CC BY-SA 2.0, via Wikimedia Commons

To see comments and full article enter: Docker.py- python API for Docker

Django Girls- Kraków #3

As I have said many times on this blog, I really like teaching others because it helps me improve myself. That's why when I heard about Django Girls Kraków I didn't hesitate and joined the event as a coach. This is a short recap of Django Girls Kraków #3.

Installation party

The main event was held on Saturday, but the day before there was a small installation party where for two hours the girls were installing the tools necessary for the workshops, such as Python, Django, virtualenv and git. There were 3 girls in my team: Joanna, Olga and Magda. Beforehand, the Django Girls organizers came up with a wonderful idea: to get to know everyone in the team a little bit better, every person had to write a few sentences about themselves. Thanks to that there were already conversation starters. The installation went well without any major problems (considering that the girls used Windows). After the installation party there was a pleasant surprise - a dinner for the coaches to thank them for their work. Super cool!

Workshop day

The workshops started early - at 9 am. The girls started working on the Django Girls tutorial. I decided to do the same - if I hadn't done this I wouldn't have known where problems could occur. Fortunately, there is a Windows virtual machine image, so I could work in the same environment as my protégées. I have to say that the whole workshop lasted 10 hours and it was really demanding to stay focused the whole time. Because of that, there were breaks for lunch and a contest. What is more, a small session of lightning talks was held. I have to say the girls did an amazing job, so I didn't have much to do, but some problems were not trivial. I learnt not to do 'backseat driving', that is doing everything myself rather than letting the girls do it and learn along the way. I also have to train my patience.

Conclusion

I learnt quite a few things during these workshops. Above all, I wanted to thank the organizers for their hard work on this event. Moreover, I wanted to thank my protégées - girls, you did an amazing job - keep it up!

Special thanks to Kasia for being editor for this post. Thank you.

Cover photo taken by me during Django Girls Kraków #3.

To see comments and full article enter: Django Girls- Kraków #3

JSON Web Tokens in django application- part four

When I started this series, I got a comment from my co-worker saying that, instead of authentication, JWT can be used to sign one-time links. After reading through the documentation I found it could be a great idea, so I decided to write a blog post about it.

Use case

Nowadays when a user creates an account, he or she has to confirm their identity. It is done by sending an email with a link to confirm and activate the account.

As this link has to expire and be safe, it is a good use case for JSON Web Tokens. Such a token can be generated for every user and set to expire, for example, after two hours. How can it be done in Django? Let's jump into the code.
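
To see the mechanism in isolation, here is a small sketch using PyJWT directly - the secret and the two-hour lifetime are illustrative values only, not this project's settings:

import datetime

import jwt

SECRET = 'not-the-real-secret'

# The 'exp' claim makes the token self-expiring.
payload = {
    'user_id': 42,
    'exp': datetime.datetime.utcnow() + datetime.timedelta(hours=2),
}
token = jwt.encode(payload, SECRET, algorithm='HS256')

# After two hours jwt.decode raises jwt.ExpiredSignatureError.
decoded = jwt.decode(token, SECRET, algorithms=['HS256'])
print(decoded['user_id'])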

JSON Web Tokens in urls

First, I changed the previous code from the series and made a separate django app just for users. But first the user has to register - that's why I made a new endpoint in urls.py:

from django.conf.urls import url

from users.views import UserViewSet, CreateUserView

urlpatterns = [
    # rest of url patterns
    url('^api-register/$', CreateUserView.as_view()),
]

CreateUserView looks as follows:

from django.contrib.auth.models import User
from rest_framework import permissions, status
from rest_framework.generics import CreateAPIView
from rest_framework.response import Response
from rest_framework.reverse import reverse
from rest_framework_jwt.settings import api_settings

jwt_payload_handler = api_settings.JWT_PAYLOAD_HANDLER
jwt_encode_handler = api_settings.JWT_ENCODE_HANDLER

class CreateUserView(CreateAPIView):

    model = User.objects.all()
    permission_classes = [
        permissions.AllowAny # Or anon users can't register
    ]
    serializer_class = UserSerializer

    def create(self, request, *args, **kwargs):
        serializer = self.get_serializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        self.perform_create(serializer)
        headers = self.get_success_headers(serializer.data)
        user = self.model.get(username=serializer.data['username'])
        payload = jwt_payload_handler(user)
        token = jwt_encode_handler(payload)
        return Response(
            {
                'confirmation_url': reverse(
                    'activate-user', args=[token], request=request
                )
            },
            status=status.HTTP_201_CREATED, headers=headers
        )

In this view, I simply add a few lines for creating the JWT; the rest is standard DRF code. First I create the payload by passing the user to the JWT payload handler, then I create the token from that payload by calling jwt_encode_handler. At the end, instead of returning the user data, I return a confirmation_url for the end user to visit and activate the account. By default django makes every user active, so I have to write my own create method for the UserSerializer:

from django.contrib.auth.models import User
from rest_framework import serializers
from tasks.models import Task

class UserSerializer(serializers.ModelSerializer):
    tasks = serializers.PrimaryKeyRelatedField(
        many=True, queryset=Task.objects.all()
    )

    class Meta:
        model = User
        fields = ('username', 'password', 'tasks', 'email')

    def create(self, validated_data):
        user = User(
            email=validated_data['email'],
            username=validated_data['username']
        )
        user.set_password(validated_data['password'])
        user.is_active = False
        user.save()
        return user

It simply sets the user as inactive during account creation. Right now when a user wants to create an account, he or she has to send the following request:

$ http POST 127.0.0.1:9000/api-register/ username=krzysiek password=krzysiek email=krzysztof@kz.com
HTTP/1.0 201 Created
Allow: POST, OPTIONS
Content-Type: application/json
Date: Sun, 13 Nov 2016 15:16:33 GMT
Server: WSGIServer/0.2 CPython/3.5.2
Vary: Accept
X-Frame-Options: SAMEORIGIN

{
    "confirmation_url": "http://127.0.0.1:9000/api-activate/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJlbWFpbCI6ImtyenlzenRvZkBrei5jb20iLCJ1c2VyX2lkIjoyNSwidXNlcm5hbWUiOiJrcnp5c2llayIsImV4cCI6MTQ3OTA1MDQ5M30.CMcW8ZtU6AS9LfVvO-PoLyqcwi6cOK1VzI2o7pEPX2k/"
}

How does this confirmation_url work? I made an additional urlpattern:

from users.views import ActivateUser

urlpatterns = [
    # rest of url patterns
    url(
        '^api-activate/(?P<token>.+?)/$',
        ActivateUser.as_view(),
        name='activate-user'
    ),
]

and in ActivateUser:

import jwt
from django.contrib.auth.models import User
from django.utils.translation import ugettext as _
from rest_framework import exceptions, status
from rest_framework.response import Response
from rest_framework.views import APIView
from rest_framework_jwt.settings import api_settings

jwt_decode_handler = api_settings.JWT_DECODE_HANDLER


class ActivateUser(APIView):

    def get(self, request, *args, **kwargs):
        token = kwargs.pop('token')
        try:
            payload = jwt_decode_handler(token)
        except jwt.ExpiredSignature:
            msg = _('Signature has expired.')
            raise exceptions.AuthenticationFailed(msg)
        except jwt.DecodeError:
            msg = _('Error decoding signature.')
            raise exceptions.AuthenticationFailed(msg)
        except jwt.InvalidTokenError:
            raise exceptions.AuthenticationFailed()

        user_to_activate = User.objects.get(id=payload.get('user_id'))
        user_to_activate.is_active = True
        user_to_activate.save()

        return Response(
            {'User Activated'},
            status=status.HTTP_200_OK
        )

This is a generic APIView, so I write a get method to handle GET requests. I was wondering if it's a good idea to activate the user in a GET request or whether to do it in a PUT. If you have some thoughts about this, I will be happy to hear them. In get I simply take the token from kwargs and validate it - checking whether it's valid or expired. This part of the code usually lives in an authentication backend, but there I don't have access to the URL of the request, so in this case I have to implement it this way. If you have other ways of handling such a case, please let me know! If everything looks good, I activate the user:

$ http GET http://127.0.0.1:9000/api-activate/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJlbWFpbCI6ImtyenlzenRvZkBrei5jb20iLCJ1c2VyX2lkIjoyNSwidXNlcm5hbWUiOiJrcnp5c2llayIsImV4cCI6MTQ3OTA1MDQ5M30.CMcW8ZtU6AS9LfVvO-PoLyqcwi6cOK1VzI2o7pEPX2k/
HTTP/1.0 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Date: Sun, 13 Nov 2016 15:17:37 GMT
Server: WSGIServer/0.2 CPython/3.5.2
Vary: Accept
X-Frame-Options: SAMEORIGIN

[
    "User Activated"
]

$ http GET http://127.0.0.1:9000/api-activate/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJlbWFpbCI6ImtyenlzenRvZkBrei5jb20iLCJ1c2VyX2lkIjoyNSwidXNlcm5hbWUiOiJrcnp5c2llayIsImV4cCI6MTQ3OTA1MDQ5M30.CMcW8ZtU6AS9LfVvO-PoLyqcwi6cOK1VzI2o7pEPX2k/
 HTTP/1.0 401 Unauthorized
 Allow: GET, HEAD, OPTIONS
 Content-Type: application/json
 Date: Sun, 13 Nov 2016 15:28:00 GMT
 Server: WSGIServer/0.2 CPython/3.5.2
 Vary: Accept
 WWW-Authenticate: JWT realm="api"
 X-Frame-Options: SAMEORIGIN

 {
     "detail": "Signature has expired."
 }

By default django-rest-framework-jwt sets the token expiry time to 5 minutes. If you want to change that, add the following lines to settings.py:

import datetime

JWT_AUTH = {
     'JWT_EXPIRATION_DELTA': datetime.timedelta(seconds=7)
}

That's all for today! Feel free to comment and check out the repo for this blog post under this link.

To see comments and full article enter: JSON Web Tokens in django application- part four

Django Under The Hood 2016 recap

From the beginning I really wanted to contribute to Django. I asked a friend of mine - "Do you know where I can start contributing?" She answered - "Go to Django Under The Hood". So I went. This is my small recap of the event.

Day one

After wandering around the city a little bit, I finally got to the venue and the talks started - the first one was about Channels by Andrew Godwin. Until then I had heard about the topic but I hadn't really gone into the details of what it is useful for. Andrew presented a very well-thought-through explanation of what channels really are and what they can be used for. I would still like to see them in production to find out how they work there. As a guy who hadn't heard much about this topic before, I liked it very much.

Right after that was a talk about testing by Ana Balica. She started by introducing how testing in django has evolved, which I really liked. Then came an explanation of what happens when you execute a test suite via django, and what is happening in the various test case classes and clients in Django. I really liked the segment about tools that you can use to enhance your testing and the 8 tips on how to speed up tests. Another really interesting talk. You can find the slides here.

The last talk of the day was about debugging by Aymeric Augustin. It was a talk about how to speed up your page load. As it turns out, the backend is responsible for only about 20% of page load time - a good thing to consider when improving performance. To speed up your page load you should start by improving your frontend and then move to the backend. When it comes to the backend, I heard some interesting ideas on how to improve performance.

Day two

The second day started with a keynote by Jennifer Akullian. It was a talk about mental health in IT. I found this topic really interesting and I was happy that it was raised.

The next talk was a more technical one about validation by Loïc Bistuer. It was a really interesting talk about forms and validation. It was deeply technical, which at times was difficult for me to follow, but that is a good thing - you don't learn when everything is comfortable.

Then there was a talk about JavaScript by Idan Gazit. This talk gave me a lot because of my rising interest in JavaScript. I heard about various tools and what it means to write modern JavaScript. I also heard about promises - the thing that is currently on top of the JavaScript world, so it came up in every other talk on the subject :). Overall, the talk gave me a lot of information that I can use further.

The next one was a talk about database backends by Michael Manfre. It dived deep into the django ORM to show how to develop a new database backend, in this case for Microsoft MSSQL. A lot of useful info.

After a coffee break, there was a talk about open source funding by Nadia Eghbal. A nice talk about what it means to find funding for open source projects and what challenges you may face along the way.

The last talk was about Instagram and how it uses django, by Carl Meyer. It was an amazing talk! I really liked hearing how they evolved and what was replaced or improved along the way. The funny part was about Justin Bieber - his photos (especially the likes on those photos) were heating up the Postgres database. I enjoyed the way Instagram handles performance.

Day three & four

As the talk days ended, the time came for sprints! They were held at a different venue in Amsterdam, but I found it comfortable too. The experience was really nice, with about 300 people developing the same framework at the same time. At the beginning of the sprint, I decided to work on some GeoDjango tickets. I was able to close one and write some documentation. Awesome time!

Conclusion

It was a great time in Amsterdam! The talks were deeply technical and the sprints productive. Superb organization. Highly recommended to everyone!

Cover picture taken from DUTH twitter account: Under the Hood made by Bartek.

Special thanks to Kasia for being editor for this post. Thank you.

To see comments and full article enter: Django Under The Hood 2016 recap

