0

I wrote two scripts in Python:

The first appends to a list the key values of a dictionary.

The second one uses that list to create columns on a MySQL database.

Originally I wrote them in the same module. However, is it better if I separate them in two different modules or keep them together?

If it's better separating them, what is a pythonic way to use lists from one file in another? I know importing variables is not recommended.

Here is the code:

import pymysql

# This script first extract dictionary key values. Then, it creates columns using the keys values from the dictionaries.

# login into mysql
conn = pymysql.connect("localhost", "*****", "******", "tutorial")

# creating a cursor object
c = conn.cursor()

# update here table from mysql
table_name = "table01"

# data
dicts = {'id': 5141, 'product_id': 193, 'price_ex_tax': 90.0000, 'wrapping_cost_tax': 0.0000, 'type': 'physical', 'ebay_item_id': 444, 'option_set_id': 38, 'total_inc_tax': 198.0000, 'quantity': 2, 'price_inc_tax': 99.0000, 'cost_price_ex_tax': 0.0000, 'name': 'UQ Bachelor Graduation Gown Set', 'configurable_fields': [], 'base_cost_price': 0.0000, 'fixed_shipping_cost': 0.0000, 'wrapping_message': '', 'order_address_id': 964, 'total_ex_tax': 180.0000, 'refund_amount': 0.0000, 'event_name': None, 'cost_price_inc_tax': 0.0000, 'cost_price_tax': 0.0000, 'wrapping_cost_inc_tax': 0.0000, 'wrapping_name': '', 'price_tax': 9.0000, 'is_bundled_product ': False, 'ebay_transaction_id': 4444, 'bin_picking_number': 4444, 'parent_order_product_id': None, 'event_date': '', 'total_tax': 18.0000, 'wrapping_cost_ex_tax': 0.0000, 'base_total': 198.0000, 'product_options': [{'id': 4208, 'display_name': 'Gown size (based on height)', 'name': 'Bachelor gown size', 'display_value': 'L (175-182cm)', 'display_style': 'Pick list', 'type': 'Product list', 'option_id': 19, 'value': 77, 'product_option_id': 175, 'order_product_id': 5141}, {'id': 4209, 'display_name': 'Hood', 'name': 'H-QLD-BAC-STD', 'display_value': 'UQ Bachelor Hood', 'display_style': 'Pick list', 'type': 'Product list', 'option_id': 42, 'value': 119, 'product_option_id': 176, 'order_product_id': 5141}, {'id': 4210, 'display_name': 'Trencher size (based on head circumference)', 'name': 'Trencher size', 'display_value': 'M (53-54cm)', 'display_style': 'Pick list', 'type': 'Product list', 'option_id': 20, 'value': 81, 'product_option_id': 177, 'order_product_id': 5141}], 'base_price': 99.0000, 'sku': 'S-QLD-BAC-STD', 'return_id': 0, 'applied_discounts': [{'id': 'coupon', 'amount': 30}], 'quantity_shipped': 0, 'base_wrapping_cost': 0.0000, 'is_refunded': False, 'weight': 2.0000, 'order_id': 615496}  # noqa

# creating empty lists
int_keys_lists = []
str_keys_lists = []
list_keys_lists = []


def extractDictKeys():

    # for loop that runs through the dictionary, extracting the keys when their valures are int or str, and appending to the corresponding list
    for i, j in enumerate(dicts):
        k, v = list(dicts.items())[i]
        if type(dicts[j]) != str:
            int_keys_lists.append(k)
        else:
            str_keys_lists.append(k)


def createColumnStrKeys():
    # for loop that create a column for each str key on the list
    for i, j in enumerate(str_keys_lists):
        c.execute("ALTER TABLE {0} ADD COLUMN {1} VARCHAR(255)".format(table_name, str_keys_lists[i]))
        conn.commit()


def createColumnIntKeys():
    # for loop that create a column for each id or float key on the list
    for i, j in enumerate(int_keys_lists):
        c.execute("ALTER TABLE {0} ADD COLUMN {1} int(30)".format(table_name, int_keys_lists[i]))
        conn.commit()

extractDictKeys()
createColumnStrKeys()
createColumnIntKeys()
  • Before addressing your question, it's worth pointing out that you are demonstrating a serious no no using format() to dynamically construct your sql. While in this case it's probably not a big deal, in many others it can leave you wide open to a type of attack known as sql injection. I'm not familiar with mysql, but I'd bet money there is a way to pass a format string and a list of params into the execute function itself. That's the right way to do it. – Sean Azlin Feb 11 '15 at 07:26
  • I don't think I follow you. The format() function is using values from a list that I know it's made up of strings. – Filipe Barreto Peixoto Teles Feb 11 '15 at 07:33
  • the answer to the following question has a good example: http://stackoverflow.com/questions/775296/python-mysql-with-variables – Sean Azlin Feb 11 '15 at 07:53
  • My point is that there is never a reason to construct your sql this way, and there is a damn good reason to never do it that way. Even if you think it's safe right now you should simply never do it. In case it helps, here's a good example and walkthrough of sql injection: http://www.programmerinterview.com/index.php/database-sql/sql-injection-example/ – Sean Azlin Feb 11 '15 at 08:00

2 Answers2

0

There are some problems in your design.

Functions should not be using global variables. Functions receive variables as parameters, and then return a value (the value is void in some cases, though)

For example:

def extract_dict_keys(dictionary):
    key_list = list()
    int_key_list = list()
    str_key_list = list()
    # Your code here
    ..........
    return key_list, int_key_list, str_key_list

def create_col__str_keys(conn, dictionary, key_list):
    cursor = conn.cursor()
    # Your code here
    ..........
    # Commit outside of the loop
    conn.commit() # Though it's usually not recommended to commit in the function

def create_col__int_keys(conn, dictionary, key_list):
    cursor = conn.cursor()
    # Your code here
    ..........
    conn.commit()

Only after you have made all your pieces of code independent with each other, Should you be able to structure them into modules

How to structure your code into module depends on how much the codes are reusable, and how do they relate to each other. For example, I would put all global variables into a main file that would be executed, utility functions into another module, and sql-related functions into another one:

main.py

from utilities import extract_dict_keys
from sql import create_col_str_keys, create_col_int_keys

# login into mysql
conn = pymysql.connect("localhost", "*****", "******", "tutorial")

# data
data = {...}  # noqa
keys, int_keys, str_keys = extract_dict_keys(data)
create_col_int_keys(conn, data, int_keys)
create_col_str_keys(conn, data, str_keys)

utilities.py

def extract_dict_keys(dictionary):
    key_list = list()
    int_key_list = list()
    str_key_list = list()
    # Your code here
    ..........
    return key_list, int_key_list, str_key_list

sql.py

import pymysql

def create_col__str_keys(conn, dictionary, key_list):
    cursor = conn.cursor()
    # Your code here
    ..........
    conn.commit()

def create_col__int_keys(conn, dictionary, key_list):
    cursor = conn.cursor()
    # Your code here
    ..........
    conn.commit()

Also, concerning the SQL Injection vulnerability, you should not use cursor.execute(a_built_string)

Consider this situation:

cursor.execute("DELETE FROM users WHERE id = {uid}".format(uid=user_id))

And someone clever enough to input user_id="1 or 1 = 1". The result would be:

cursor.execute("DELETE FROM users WHERE id = 1 or 1 = 1")

Which will probably delete all users from your precious users table.

Therefore, use this instead:

cursor.execute("DELETE FROM users WHERE id = %d", user_id)

It will not create a string, compile and execute it, but compile the SQL statement, then insert the variable, then execute it, which is a lot safer

Quan To
  • 697
  • 3
  • 10
0

There are two things I'd suggest researching a bit before asking yourself this question.

First, you're relying on a bunch of globals in module scope. This problem you're seeing of making those globals visible in other modules is only going to get worse as you write more complex programs. You should start thinking about involving classes a bit, importing those, instantiating them and looking at design patterns such as singletons. It's much easier and cleaner to define a bunch of important lists, variables, etc. under classes and then import those and pass instances of them around between functions. Being object-oriented is pythonic.

Second, your functions are conceptually looking more like subroutines. It would better to think of functions as... well... functions. They should take some arguments (inputs) and produce (ie. return) some output. Steer away from trying to manipulate globals, which can easily become problematic, and think more about using inputs to create and return output.

Research and combine these concepts and the answer to where code should live will become more clear.

Sean Azlin
  • 886
  • 7
  • 21